
Intel x86_64 assembly compare signed double precision floats


I have the problem described in the title.

In the xmm0 register I have a value, e.g. -512.000000,
and in xmm4 I have 0.000000.

I am trying to compare the first value with zero, but I cannot get it to work.

comisd xmm0, xmm4

The COMISD instruction sets the flags in a way I find strange, and only a jnz after it behaves as expected in my code.

How can I do this comparison?


Solution

  • Intel's manual documents how COMISD sets flags.

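    First, check for unordered with a jp (if you want your code to work properly for NaN inputs). COMISD reports an unordered result, meaning at least one operand is NaN, by setting ZF=PF=CF=1; for ordered operands PF is always 0.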

    Then you can use any of the ja / jae / jb / jbe (above / below) conditions or their negatives (jna, etc.), or je / jne (equal / not-equal). These are the same conditions as for unsigned integer compares. cmovcc and setcc work too.
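
    As a minimal sketch of that sequence applied to the compare-with-zero case from the question: the code below assumes NASM syntax and the x86-64 System V calling convention (first double argument in xmm0, int result in eax). The function name classify_sign and its return codes are hypothetical, chosen only for illustration.

    ```nasm
    ; int classify_sign(double x);   x arrives in xmm0 (System V ABI assumed)
    ; Returns -1 if x < 0, 0 if x == 0, 1 if x > 0, 2 if x is NaN.
            global  classify_sign
            section .text
    classify_sign:
            xorpd   xmm4, xmm4        ; xmm4 = +0.0
            comisd  xmm0, xmm4        ; set ZF, PF, CF from xmm0 vs. xmm4
            jp      .unordered        ; PF=1: x is NaN, handle it first
            jb      .negative         ; CF=1: x < 0.0
            je      .zero             ; ZF=1: x == 0.0 (-0.0 also compares equal)
            mov     eax, 1            ; neither below nor equal: x > 0.0
            ret
    .negative:
            mov     eax, -1
            ret
    .zero:
            xor     eax, eax
            ret
    .unordered:
            mov     eax, 2
            ret
    ```

    With -512.0 in xmm0 the jb branch is taken. None of the instructions between comisd and the last conditional jump write the flags, so all three tests see the result of the same compare.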

    These conditions have synonyms like jc (jump if CF==1, the same encoding as jb), but above/below have the right semantic meaning, so they can reduce the number of comments needed to make your code human-readable.

    You can skip the jp before a condition like ja because CF=0 implies PF=0. (i.e. the a condition will be false if the operands were unordered).
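
    A short sketch of that case, using setcc instead of a branch. It assumes NASM syntax and the System V convention (a in xmm0, b in xmm1); the name is_greater is hypothetical.

    ```nasm
    ; int is_greater(double a, double b);   a in xmm0, b in xmm1 (assumed ABI)
    ; Returns 1 if a > b, otherwise 0. NaN inputs give 0 without any jp.
            global  is_greater
            section .text
    is_greater:
            xor     eax, eax          ; zero eax BEFORE the compare: xor writes flags
            comisd  xmm0, xmm1        ; compare a with b
            seta    al                ; al = 1 only if CF=0 and ZF=0
            ret
    ```

    For unordered operands CF is 1, so seta leaves al at 0 and the NaN case needs no separate check.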

    Some other conditions, like b (below: CF=1), be (CF=1 or ZF=1), or e (ZF=1), will be true for unordered operands, so you need to branch on jp before or after jb / jbe / je if you need to rule out NaN inputs.

    You can reverse the compare operands so that a below-type test becomes an above-type one: a < b becomes b > a (ja instead of jb), and a <= b becomes b >= a (jae instead of jbe). That groups the unordered case with the not-taken path instead of the taken one.
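
    The two ways of getting a NaN-safe "less than" can be sketched side by side. This again assumes NASM syntax and the System V convention (a in xmm0, b in xmm1); the names is_less_jp and is_less_swapped are hypothetical.

    ```nasm
    ; Both functions return 1 if a < b, otherwise 0 (including for NaN inputs).
            global  is_less_jp, is_less_swapped
            section .text

    ; Variant 1: keep the operand order and rule out NaN with an explicit jp.
    is_less_jp:
            xor     eax, eax          ; default result 0, set before the compare
            comisd  xmm0, xmm1        ; compare a with b
            jp      .done             ; unordered: keep the 0
            setb    al                ; CF=1 here can only mean a < b
    .done:
            ret

    ; Variant 2: swap the operands so the test becomes "above".
    is_less_swapped:
            xor     eax, eax
            comisd  xmm1, xmm0        ; compare b with a
            seta    al                ; b > a, i.e. a < b; false if unordered
            ret
    ```

    The swapped form is one instruction shorter and branch-free, at the cost of the compare reading "backwards" relative to the source-level condition.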


    Historical interest / why it was designed this way:

    Note that comisd's flag-setting matches what you get from x87 fcom / fnstsw ax / sahf. See also this x87 tutorial/guide for an example of using that sequence. But only for historical interest! fcomi, which also sets the flags the same way, has been around for over 20 years (P6), and is faster. (See the tag wiki for more links).

    x87 is only useful for 80-bit extended precision, since it's generally safe to assume that SSE2 is available, even in 32-bit mode.

    See also Why do x86 FP compares set CF like unsigned integers, instead of using signed conditions? for a more detailed look at why x87 used those flags, which fcomi and SSE maintained compatibility with.