java math floating-point nan ieee-754

The sum of NaNs: How is the underlying bit pattern calculated?


Question: How does the JVM compute the sum of two double-precision NaNs?

Details: The IEEE 754 specification reserves two ranges of bit patterns for NaNs:

0x7ff0000000000001 -> 0x7fffffffffffffff

and

0xfff0000000000001 -> 0xffffffffffffffff.
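As a quick sanity check on those two ranges, here is a minimal sketch that tests a 64-bit pattern with bit masks (all exponent bits set, nonzero fraction) and compares the answer with `Double.isNaN`. The class and method names are made up for illustration.

```java
class NanRanges {
    static final long EXPONENT_MASK = 0x7ff0000000000000L;
    static final long FRACTION_MASK = 0x000fffffffffffffL;

    // A pattern is a NaN when every exponent bit is set and the fraction is nonzero.
    // The sign bit is ignored, which is why there are two ranges.
    static boolean isNaNPattern(long bits) {
        return (bits & EXPONENT_MASK) == EXPONENT_MASK && (bits & FRACTION_MASK) != 0;
    }

    public static void main(String[] args) {
        long[] samples = {
            0x7ff0000000000000L, // +Infinity, one below the first range
            0x7ff0000000000001L, // start of the first range
            0x7fffffffffffffffL, // end of the first range
            0xfff0000000000000L, // -Infinity, one below the second range
            0xfff0000000000001L, // start of the second range
            0xffffffffffffffffL  // end of the second range
        };
        for (long bits : samples) {
            boolean viaMask = isNaNPattern(bits);
            boolean viaJdk = Double.isNaN(Double.longBitsToDouble(bits));
            System.out.printf("0x%016x mask=%b jdk=%b%n", bits, viaMask, viaJdk);
        }
    }
}
```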

IEEE 754 requires the sum of two NaNs to be a NaN but, as far as I can tell, is silent on implementation details. So, to get to my question: if we write b(x) for the hexadecimal bit pattern of a NaN x, how does the JVM calculate b(x + y) from b(x) and b(y)? Playing with a little bit of code, I'm led to believe:

Claim: Let t = 0x0008000000000000. If b(y) + t is in a valid NaN range, then:

b(x + y) = b(y) + t

otherwise,

b(x + y) = b(y).
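The claim can be written down as a small function, which also makes one property of it easier to see. Since t is a single bit (bit 51, the most significant fraction bit), adding it to a NaN pattern either sets that bit or, if the bit is already set, carries into the exponent and leaves the NaN range. So for NaN inputs the claim appears to be the same as "set bit 51 of b(y)". The sketch below is only a restatement of the claim, not a description of what the JVM does; the names are hypothetical.

```java
class ClaimedRule {
    static final long T = 0x0008000000000000L; // bit 51, the top fraction bit

    static boolean isNaNPattern(long bits) {
        return Double.isNaN(Double.longBitsToDouble(bits));
    }

    // The claim as stated: use b(y) + t if that is still a NaN pattern, else b(y).
    // Long addition wraps on overflow, which is what we want for raw bit patterns.
    static long claimed(long by) {
        long candidate = by + T;
        return isNaNPattern(candidate) ? candidate : by;
    }

    // Equivalent formulation for NaN inputs: just set bit 51.
    static long setTopFractionBit(long by) {
        return by | T;
    }

    public static void main(String[] args) {
        long[] ys = { 0xfffaeaba08397e4eL, 0xfff79342c97104ffL };
        for (long by : ys) {
            System.out.printf("b(y)=0x%016x claimed=0x%016x or=0x%016x%n",
                    by, claimed(by), setTopFractionBit(by));
        }
    }
}
```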

This seems strange to me, and I'd like to know more. For the record, I'm using Java 8 on an Intel i7 MacBook (in case the Java version or the physical hardware matters). Here are two examples:

Example 1, where b(x + y) = b(y):

b(x) = 0x7fffddee0f43e7d4
b(y) = 0xfffaeaba08397e4e
b(x + y) = 0xfffaeaba08397e4e

Example 2, where b(x + y) = b(y) + t:

b(x) = 0xffff4f0202031106
b(y) = 0xfff79342c97104ff
b(x + y) = 0xffff9342c97104ff
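For anyone who wants to repeat the experiment, a minimal sketch along these lines should do. It builds the two NaNs from their bit patterns, adds them, and reads the result back with `Double.doubleToRawLongBits`. The class name is made up, and the printed patterns may differ between machines and JVMs, so no expected output is given.

```java
class NanSum {
    // Returns b(x + y) given b(x) and b(y).
    static long sumBits(long bx, long by) {
        double x = Double.longBitsToDouble(bx);
        double y = Double.longBitsToDouble(by);
        // doubleToRawLongBits preserves the NaN payload; doubleToLongBits would
        // collapse every NaN to the canonical pattern 0x7ff8000000000000.
        return Double.doubleToRawLongBits(x + y);
    }

    public static void main(String[] args) {
        long[][] examples = {
            { 0x7fffddee0f43e7d4L, 0xfffaeaba08397e4eL }, // Example 1
            { 0xffff4f0202031106L, 0xfff79342c97104ffL }  // Example 2
        };
        for (long[] e : examples) {
            System.out.printf("b(x)=0x%016x b(y)=0x%016x b(x+y)=0x%016x%n",
                    e[0], e[1], sumBits(e[0], e[1]));
        }
    }
}
```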

Does anyone know how the sum is being evaluated by the JVM?

Thanks!


Solution

  • For floating-point arithmetic, most implementations simply rely on what the underlying hardware does. Assuming you're running this on some sort of x86 architecture, the rules are the ones given in Section 4.8.3.5 of https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf:

    [Image: table of rules for handling NaNs, from the Intel manual section cited above]
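To make the hardware rule concrete, here is a rough software model of how I understand the SSE column of that table: when at least one operand of an addition is a NaN, the result is the first NaN source operand, converted to a quiet NaN if it was signaling (that is, with bit 51 set). This is an assumption based on my reading of the manual, not code taken from the JVM or from Intel, and the class and method names are hypothetical.

```java
class SseNanModel {
    static final long QUIET_BIT = 0x0008000000000000L; // bit 51

    static boolean isNaNPattern(long bits) {
        return Double.isNaN(Double.longBitsToDouble(bits));
    }

    // Assumed model of an SSE add with a NaN input: pick the first operand
    // if it is a NaN, otherwise the second, then force the quiet bit on.
    static long addWithNaN(long first, long second) {
        if (!isNaNPattern(first) && !isNaNPattern(second)) {
            throw new IllegalArgumentException("model only covers NaN inputs");
        }
        long chosen = isNaNPattern(first) ? first : second;
        return chosen | QUIET_BIT;
    }

    public static void main(String[] args) {
        // The question's examples, passed with b(y) as the first operand.
        System.out.printf("0x%016x%n", addWithNaN(0xfffaeaba08397e4eL, 0x7fffddee0f43e7d4L));
        System.out.printf("0x%016x%n", addWithNaN(0xfff79342c97104ffL, 0xffff4f0202031106L));
    }
}
```

Under this model, passing the question's examples as (b(y), b(x)) reproduces both observed results. That would be consistent with the addition being issued with y as the first source operand, but that operand order is a guess about the generated code, not something confirmed here.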