Tags: assembly, x86, floating-point, x87

Why does FLD m80fp not raise an exception for SNaN inputs while FLD of double or float can?


Here are the possible exceptions when using FLD:

  • #IS Stack underflow or overflow occurred.
  • #IA Source operand is an SNaN. Does not occur if the source operand is in double extended-precision floating-point format (FLD m80fp or FLD ST(i)).
  • #D Source operand is a denormal value. Does not occur if the source operand is in double extended-precision floating-point format.

Why does the #IA exception "not occur if the source operand is in double extended-precision floating-point format"?

I think the double-precision and double extended-precision floating-point formats are basically the same. Both are capable of encoding an SNaN.

Is there any logical reason for that difference or is it just the way it is?


Solution

  • fld m64fp and fld m32fp have to convert to the internal 80-bit format used in x87 registers. You can think of this conversion process as the thing that can raise #IA for an SNaN.

    fld m80fp is just a pure load of data that's already in the native internal format, like frstor.

    (Wikipedia claims AMD CPUs do signal an FP exception on SNaN 80-bit loads, and that this is one of the minor differences between the AMD and Intel implementations of x86-64. But that Wiki list has proved inaccurate before, e.g. a bsf/bsr claim seemed to be based on a misunderstanding of the docs, not on actual experiments. @Electro comments that AMD's manual says fld won't raise an FP exception when the source format is 80-bit, but that fst/fstp always raise an FP exception when storing an SNaN.)

    The conversion from float or double to the 80-bit extended format has to examine the bits of the source value: it widens the exponent field, extends the mantissa, and adds an explicit leading 1 or 0 depending on whether the exponent field is non-zero (normal) or zero (subnormal).

    This explicit vs. implicit mantissa bit is a major difference between x87 double-extended vs. IEEE binary64 aka double or qword. Both can encode SNaN but they're definitely not "basically the same" the way binary32 and binary64 are (just wider fields).
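As a rough illustration of the explicit vs. implicit leading bit, here is a minimal Python sketch that widens the bits of a binary64 value into the three fields of the x87 double-extended format (1 sign bit, 15 exponent bits, 64 significand bits with an explicit integer bit). The function name `double_to_x87_fields` is made up for this example. It is a pure bit-level model: it does not touch the FPU and raises no exceptions. It normalizes subnormal doubles, since they fit as normal numbers in the wider exponent range. A NaN is widened without quietening it, which is a simplification of what hardware does on fld m64fp.

```python
import struct

def double_to_x87_fields(x):
    """Widen a binary64 value to (sign, 15-bit exponent, 64-bit significand)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exp11 = (bits >> 52) & 0x7FF
    frac52 = bits & ((1 << 52) - 1)

    if exp11 == 0x7FF:
        # Inf or NaN: all-ones exponent, explicit integer bit set.
        return sign, 0x7FFF, (1 << 63) | (frac52 << 11)
    if exp11 != 0:
        # Normal: re-bias the exponent, make the implicit leading 1 explicit.
        return sign, exp11 - 1023 + 16383, (1 << 63) | (frac52 << 11)
    if frac52 == 0:
        # Signed zero.
        return sign, 0, 0
    # Subnormal double: value is frac52 * 2**-1074. Shift the top set bit
    # into bit 63 so it becomes the explicit integer bit of a normal number.
    shift = 64 - frac52.bit_length()
    return sign, 63 - shift - 1074 + 16383, frac52 << shift

sign, exp15, sig64 = double_to_x87_fields(1.5)
print(sign, hex(exp15), hex(sig64))
```

For 1.5 the significand has its top two bits set: the explicit integer bit (bit 63) and the first fraction bit (bit 62). In binary64 only the fraction bit is stored.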


    This "inconsistency" presumably dates back to 8087 when transistor budgets were very limited; making fld m80fp check for an SNaN even though it's not using the usual conversion hardware would have cost extra transistors.

    Note that fld m80fp is the only way you can make the x87 FPU read a tbyte FP value (other than frstor or the more modern fxrstor or xrstor). There is no fadd m80fp or anything. So no operation that involves reading an m80fp from memory ever has to raise an exception for SNaN.

    There are memory source operand forms of most FP math instructions, like fadd st0, m64fp and fadd st0, m32fp, which presumably also need to convert to internal format as part of their operation. So it makes sense that you'd want to detect a memory-source SNaN as part of that conversion.

    So if you were designing 8087, it would make sense to have the logic that handles loads from memory check for SNaN while converting 32-bit and 64-bit inputs, but not while just loading the 80-bit native format. This is probably where Intel's behaviour originally came from, and there was no reason to make later CPUs different, so they preserved it.
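The design reasoning above can be sketched as a toy software model of the load path. It is only a model under stated assumptions, not a description of real hardware internals. `model_fld`, `is_snan` and `X87Invalid` are hypothetical names. The loaded value is kept in its source format rather than widened. An unmasked #IA is modelled as a plain Python exception. The flag and mask positions (bit 0 of the status word and of the control word) and the default control word 0x037F (all exceptions masked) follow the x87 definitions.

```python
IE = 0x0001  # invalid-operation flag, bit 0 of the x87 status word
IM = 0x0001  # invalid-operation mask, bit 0 of the x87 control word

FIELDS = {32: (8, 23), 64: (11, 52)}  # width -> (exponent bits, fraction bits)

class X87Invalid(Exception):
    """Stands in for an unmasked #IA in this model."""

def is_snan(bits, width):
    exp_bits, frac_bits = FIELDS[width]
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    frac = bits & ((1 << frac_bits) - 1)
    quiet = (frac >> (frac_bits - 1)) & 1
    return exp == (1 << exp_bits) - 1 and frac != 0 and quiet == 0

def model_fld(bits, width, status_word=0, control_word=0x037F):
    """Return (loaded bits, new status word) for a modelled fld."""
    if width == 80:
        # Native format: a raw copy, nothing inspects the value.
        return bits, status_word
    if is_snan(bits, width):
        # The SNaN check rides along with the 32/64-bit conversion.
        status_word |= IE
        if not (control_word & IM):
            raise X87Invalid("SNaN source operand")
        # Masked response: quieten the NaN by setting the top fraction bit.
        bits |= 1 << (FIELDS[width][1] - 1)
    return bits, status_word

print(model_fld(0x7FF0000000000001, 64))      # 64-bit SNaN
print(model_fld(0x7FFF8000000000000001, 80))  # 80-bit SNaN, copied as-is
```

In this model the 64-bit SNaN comes back quietened with the IE flag set, while the 80-bit SNaN passes through unchanged with a clean status word.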

    IDK whether to look at that as a downside, or whether it's actually a good thing that you can load 80-bit native FP values without the possibility of raising an exception. If the Wikipedia claim above were right, AMD decided it wasn't a good thing and signals an FP exception on fld m80fp of an SNaN, but AMD's manual reportedly says otherwise.

    Or you could look at it as a bad thing that fld dword / qword might raise an exception when just re-formatting a float with no possibility of data loss, and not doing any actual computation.


    Background:

    Normally you never encounter an SNaN in the first place. Outputs of invalid operations like 0 / 0 are QNaNs, IIRC. (Dividing a non-zero value by 0 gives an infinity, not a NaN.) So you only get an SNaN if you create one yourself with integer instructions or as static constant data. (I think.)
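To make that concrete, here is a small Python sketch that builds a binary64 SNaN from an integer bit pattern and classifies NaNs by the quiet bit (the top fraction bit). The helper names are made up for this example. The invalid operation used is inf - inf, because Python raises ZeroDivisionError for 0.0 / 0.0 instead of returning a NaN.

```python
import math
import struct

def bits_to_double(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def double_to_bits(x):
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def classify(bits):
    """Classify a binary64 bit pattern as 'SNaN', 'QNaN' or 'not a NaN'."""
    exp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    if exp != 0x7FF or frac == 0:
        return "not a NaN"
    # The top fraction bit is the quiet bit.
    return "QNaN" if frac & (1 << 51) else "SNaN"

# An SNaN has to be made by hand: all-ones exponent, quiet bit clear,
# and some other fraction bit set.
SNAN_BITS = 0x7FF0000000000001

# An invalid arithmetic operation produces a quiet NaN.
inf = float("inf")
invalid_result = inf - inf

print(classify(SNAN_BITS))
print(classify(double_to_bits(invalid_result)))
print(math.isnan(bits_to_double(SNAN_BITS)))
```

Whether the signaling bit pattern survives a round trip through a Python float depends on how the platform moves floating-point values around, so the sketch classifies the integer pattern directly instead of relying on that.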

    And of course normally you have FP exceptions masked so it doesn't fault, just sets a sticky bit in the FP status word.