Laksen, thanks for explaining it. I'm looking at the generated assembly:
addl $1,%r9d
movslq %ebx,%rax
movslq %r9d,%r8
cqto
idivq %r8
testq %rdx,%rdx
jne .Lj13
So it does seem to be using the "idivq" instruction. However, looking at the same code compiled for an x64 target with
gcc/clang, they seem to generate the following:
mov eax, ebx
cdq
idiv ecx
test edx, edx
je .L5
I suppose that since they use the 32-bit "idiv" rather than the 64-bit "idivq", it might be faster. Is there any way to tell FreePascal to do that?
P.S. Using (int64_t)x % (int64_t)y in gcc/clang
still seems to generate "idiv" instead of "idivq"?
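
For reference, here is a minimal sketch of the two C variants being compared, assuming x and y are plain 32-bit ints. The function names rem32 and rem64 are made up for illustration and are not from the original code. Which division instruction each one compiles to depends on the compiler and flags, so it is worth checking the assembly rather than trusting this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: remainder computed directly on 32-bit operands. */
int32_t rem32(int32_t x, int32_t y)
{
    return x % y;
}

/* Hypothetical helper: the same operands widened to 64 bits before the
   remainder, as in the (int64_t)x % (int64_t)y expression above. */
int64_t rem64(int32_t x, int32_t y)
{
    return (int64_t)x % (int64_t)y;
}
```

Compiling this file with "gcc -O2 -S" writes the assembly for both functions to a .s file, so the two can be compared side by side. For operands that start out as 32-bit values, both functions return the same remainder whenever y is neither 0 nor -1 with x at the minimum int32_t value.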