assembly, compiler-errors, x86, mismatch

Sqrt in Assembly x86


I've found some suggestions online.

I have a similar problem, but none of the suggestions helped (or I could not figure out how to apply them to my program).

The code is inserted as asm(...) in a C program.

After compiling with -masm=intel, when using:

asm ("FLD EBX \n" "FSQRT \n" "FST EBX \n").

I get the compilation error:

"Error: operand type mismatch for 'fld'" "... mismatch for 'fst'".

EBX holds some integer positive value before these commands.

So what is the correct way of getting ebx = sqrt(ebx)?


Solution

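  • The x87 fld and fst instructions only accept memory or st(i) operands, never a general-purpose register like ebx, which is why the assembler reports an operand type mismatch. You should use SSE / SSE2 for sqrt in modern code, not x87. You can directly convert an integer in a general-purpose register to a double in an xmm register with one instruction.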

    cvtsi2sd  xmm0, ebx
    sqrtsd    xmm0, xmm0     ; sd means scalar double, as opposed to SIMD packed double
    cvttsd2si  ebx, xmm0     ; convert with truncation (C-style cast)
    
    ; cvtsd2si  ecx, xmm0    ; rounded to nearest integer (or whatever the current rounding mode is)
    

    This works for 64-bit integers, too (rbx, in 64-bit mode), but note that double can only exactly represent every integer up to 2^53 (mantissa size). If you want to check whether an integer is a perfect square, you can use a floating-point sqrt and then do a trial multiplication of the integer result: ((a*a) == b).
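
    The trial-multiplication check can be sketched in C roughly as follows. This is a minimal illustration, not code from the original answer: the name is_perfect_square is hypothetical, and it assumes a 32-bit input so the double conversion is exact.

    ```c
    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helper (not from the original answer):
     * returns true if b is a perfect square. */
    static bool is_perfect_square(int32_t b)
    {
        if (b < 0)
            return false;               /* negative numbers have no real root */

        /* The cast truncates toward zero, like cvttsd2si. */
        int32_t a = (int32_t)sqrt((double)b);

        /* Trial multiplication, widened to 64 bits for safety. */
        return (int64_t)a * a == (int64_t)b;
    }
    ```

    For 64-bit inputs above 2^53 the conversion to double is no longer exact, so this sketch would need extra care there.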

    See the x86 tag wiki for links to guides, tutorials, and manuals.


    Note that inserting this code into the middle of a C program is completely the wrong approach. GNU C inline asm is the hardest way to do asm, because you have to really understand everything to get the constraints right. Getting them wrong can lead to other surrounding code breaking in subtle and hard-to-debug ways, rather than just the thing you're doing with inline asm being wrong. See the x86 tag wiki for more detail about this.
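
    If inline asm really is required, a minimal sketch with explicit constraints might look like the following. It assumes GCC or Clang targeting x86 with SSE2; isqrt_asm is a hypothetical name, and the {att|intel} dialect alternatives are there so the same template should assemble with or without -masm=intel. Only the sqrtsd goes through asm: the int/double conversions are left to the compiler, and no register is hard-coded.

    ```c
    #include <math.h>

    /* Hypothetical helper (not from the original answer). */
    static int isqrt_asm(int b)
    {
        double in = b;      /* compiler emits the int -> double conversion */
        double out;

    #if defined(__GNUC__) && defined(__SSE2__)
        /* "=x" / "x" ask for any xmm register; the compiler picks which.
         * {att|intel} selects the operand order for the active dialect. */
        __asm__ ("sqrtsd {%1, %0|%0, %1}"
                 : "=x" (out)
                 : "x" (in));
    #else
        out = sqrt(in);     /* fallback on targets without SSE2 */
    #endif

        return (int)out;    /* truncating conversion, like cvttsd2si */
    }
    ```

    Because the operands are described through constraints, the compiler knows exactly which registers are read and written, so the surrounding code is not silently clobbered.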

    If you want int a = sqrt((int)b), then write that in your code and let the compiler generate those three instructions for you. By all means read and understand the compiler output, but don't just blindly plop a sequence in the middle of it with asm("").

    e.g.:

    #include <math.h>
    int isqrt(int a) { return sqrt(a); }
    

    compiles to (gcc 5.3 without -ffast-math):

        pxor    xmm0, xmm0      # D.2569
        cvtsi2sd        xmm0, edi       # D.2569, a
        sqrtsd  xmm1, xmm0  # tmp92, D.2569
        ucomisd xmm1, xmm1        # tmp92, tmp92
        jp      .L7 #,
        cvttsd2si       eax, xmm1     # D.2570, tmp92
        ret
    .L7:
        sub     rsp, 8    #,
        call    sqrt    #
        add     rsp, 8    #,
        cvttsd2si       eax, xmm0     # D.2570, tmp92
        ret
    

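    sqrt() has to set errno (to EDOM) when the input is negative, so gcc checks whether the sqrtsd result is NaN (ucomisd / jp) and calls the library function in that case. :/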

    With -fno-math-errno:

        pxor    xmm0, xmm0      # D.2569
        cvtsi2sd        xmm0, edi       # D.2569, a
        sqrtsd  xmm0, xmm0  # tmp92, D.2569
        cvttsd2si       eax, xmm0     # D.2570, tmp92
        ret
    

    The pxor is there to break the false dependency on the previous contents of xmm0, because cvtsi2sd made the strange design decision to leave the upper half of the destination vector register unmodified. That's only useful if you want to insert the conversion result into an existing vector, but there's already cvtdq2pd to do a packed conversion. (And they probably didn't have 64-bit integers in mind, since AMD64 was still on the drawing board when Intel released SSE2.)