I've found some suggestions online.
I have a similar problem, but none of the suggestions helped (or I could not work out how to apply them to my program).
The code is inserted as asm(...) in a C program. After compiling with -masm=intel, when using:
asm ("FLD EBX \n" "FSQRT \n" "FST EBX \n")
I get the compilation error:
"Error: operand type mismatch for 'fld'" "... mismatch for 'fst'".
EBX holds some integer positive value before these commands.
So what is the correct way of getting ebx = sqrt(ebx)?
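The error comes from the operands: x87 instructions like fld and fst only take memory or x87 stack registers as operands, never a general-purpose register like EBX. In any case, you should use SSE / SSE2 for sqrt in modern code, not x87. You can directly convert an integer in a general-purpose register to a double in an xmm register with one instruction.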
cvtsi2sd xmm0, ebx
sqrtsd xmm0, xmm0 ; sd means scalar double, as opposed to SIMD packed double
cvttsd2si ebx, xmm0 ; convert with truncation (C-style cast)
; cvtsd2si ecx, xmm0 ; rounded to nearest integer (or whatever the current rounding mode is)
This works for 64-bit integers, too (rbx), but note that double can only exactly represent every integer up to 2^53 (the size of its mantissa). If you want to check if an integer is a perfect square, you can use float sqrt and then do a trial multiplication of the integer result ((a*a) == b).
See the x86 tag wiki for links to guides, tutorials, and manuals.
Note that inserting this code into the middle of a C program is completely the wrong approach. GNU C inline asm is the hardest way to do asm, because you have to really understand everything to get the constraints right. Getting them wrong can lead to other surrounding code breaking in subtle and hard-to-debug ways, rather than just the thing you're doing with inline asm being wrong. See the x86 tag wiki for more detail about this.
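To give an idea of what "getting the constraints right" involves, here is a minimal sketch of the three-instruction sequence as GNU C extended asm. It assumes an x86-64 target with gcc or clang; the function name isqrt_asm and the scratch variable tmp are made up for illustration. Instead of hard-coding EBX, it lets the compiler pick registers, and the {att|intel} alternatives are there so the same source assembles with or without -masm=intel.

```c
/* Hypothetical example: integer sqrt via SSE2 with GNU C extended asm.
   x86 / x86-64 only; meaningful for non-negative n. */
int isqrt_asm(int n)
{
    int result;   /* output: any general-purpose register ("r") */
    double tmp;   /* scratch: any xmm register ("x"), declared as an output
                     so the compiler knows it gets overwritten */

    __asm__("cvtsi2sd {%2, %1|%1, %2}\n\t"    /* tmp = (double)n        */
            "sqrtsd %1, %1\n\t"                /* tmp = sqrt(tmp)        */
            "cvttsd2si {%1, %0|%0, %1}"        /* result = (int)tmp      */
            : "=r"(result), "=x"(tmp)
            : "r"(n));
    return result;
}
```

Even in this small case the compiler cannot see through the asm statement, so it can't constant-fold isqrt_asm(16) or schedule around it, which is one more reason to prefer plain C here.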
If you want int a = sqrt((int)b), then write that in your code and let the compiler generate those three instructions for you. By all means read and understand the compiler output, but don't just blindly plop a sequence in the middle of it with asm("").
e.g.:
#include <math.h>
int isqrt(int a) { return sqrt(a); }
compiles to (gcc 5.3 without -ffast-math):
pxor xmm0, xmm0 # D.2569
cvtsi2sd xmm0, edi # D.2569, a
sqrtsd xmm1, xmm0 # tmp92, D.2569
ucomisd xmm1, xmm1 # tmp92, tmp92
jp .L7 #,
cvttsd2si eax, xmm1 # D.2570, tmp92
ret
.L7:
sub rsp, 8 #,
call sqrt #
add rsp, 8 #,
cvttsd2si eax, xmm0 # D.2570, tmp92
ret
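sqrt() has to set errno on a domain error (a negative input), so gcc checks whether the sqrtsd result is NaN (ucomisd / jp) and only in that case falls back to calling the library function. :/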
With -fno-math-errno:
pxor xmm0, xmm0 # D.2569
cvtsi2sd xmm0, edi # D.2569, a
sqrtsd xmm0, xmm0 # tmp92, D.2569
cvttsd2si eax, xmm0 # D.2570, tmp92
ret
The pxor is to break the false dependency on the previous contents of xmm0, because cvtsi2sd leaves the upper half of the destination vector register unmodified, which was a strange design decision. That's only useful if you want to insert the conversion result into an existing vector, but there's already cvtdq2pd to do a packed conversion. (And they probably didn't have 64-bit integers in mind, since AMD64 was still on the drawing board when Intel released SSE2.)