Search code examples
integerdoubleprecisionieee-754

What is the first value at which a IEEE 64-bit float cannot distinguish two integers?


I am counting on a Double (IEEE-conformant 64-bit float) to keep track of some very large numbers. The problem is that I need to maintain unit resolution of integers (the ones place).

The problem is that I know that, as IEEE floats increase in value, they decrease in precision. I am sure that at some point, we get something like this:

...
xxxxxxxxxxxxxxxxx04
xxxxxxxxxxxxxxxxx05
xxxxxxxxxxxxxxxxx06
xxxxxxxxxxxxxxxxx08
xxxxxxxxxxxxxxxxx10
...

where some integers cannot be represented at all. I want to know where that point is, so that I might be able to guard against it, warn users, or set company policy.


Note that the I am using floating-point numbers for a good reason; the fraction part is also important to the application, but it is far less important than the integer part.


Solution

  • The IEEE-754 basic 64-bit binary floating-point format uses 53-bit significands. Every finite number it encodes has the form sF • 2e, where s (for sign) is +1 or −1, F (for fraction) is the number of a 53-bit binary numeral b.bbbbbb (note the “.” after the first bit), and e is an exponent from −1022 to +1023. (Often, floating-point numbers are normalized by shifting the bits to move the first 1 bit to the leading b position and adjusting the exponent to compensate. For discussion of representable values, we can ignore this; all numbers representable with a leading 0 are also representable with a leading 1, as long as the exponent is within bounds, which it is for this question.)

    This tells us everything needed to answer the question. Every integer from 0 to 253−1 is representable in the form +1 • b.bbbbbb • 252 for some combination of values of the bits b.bbbbbb. And 253 is representable as +1 • 1.000…000 • 253. But 253+1 is not representable because it would need an F in the form 1.000…0001, where 54 bits are required (the first for 253 and the last for 20).

    After that, 253+2 is representable, because spanning its highest and lowest 1 bit requires only 53 bits (from 253 to 21). So, at this magnitude, the even integers are representable, and the odd integers are not.

    So, the transition from consecutive representable integers to non-representable integers looks like:

    • 9,007,199,254,740,919 = 253−3 is representable.
    • 9,007,199,254,740,920 = 253−2 is representable.
    • 9,007,199,254,740,921 = 253−1 is representable.
    • 9,007,199,254,740,922 = 253 is representable.
    • 9,007,199,254,740,923 = 253+1 is not representable.
    • 9,007,199,254,740,924 = 253+2 is representable.
    • 9,007,199,254,740,925 = 253+3 is not representable.
    • 9,007,199,254,740,926 = 253+4 is representable.