Search code examples
floating-pointieee-754

Smallest float X s.t. 1/X is not infinity


What is the smallest single and double precision float such that its reciprocal is still not equal infinity under IEEE 754?

Edit: I'm asking it because I just want to understand how it works


Solution

  • Let us use IEEE 754 double-precision as an example. We assume that it is mapped to double in our C compilation platform. C99 hexadecimal notation is convenient so we'll take advantage of it. Let us also assume that long double has at least one extra bit of precision over double, for instance, long double is Intel's 80-bit “extended double“.

    The double operation 1.0 / x rounds to +inf if and only if the mathematical result of the division is above the number 1.fffffffffffff8p1023L. This number is not representable as a double, but it is exactly the midpoint between DBL_MAX and what would be the next double value after DBL_MAX if the double exponent had a wider range. This is how IEEE 754 defines whether basic operations such as / should round to infinity.

    Therefore, the highest value double value to round to +inf when reciprocated can be computed with the following steps:

    • set the rounding mode to FE_DOWNWARD
    • compute 1.0L / 1.fffffffffffff8p1023L
    • (while still in round-downwards mode) round the result to double.

    The smallest value not to round to infinity is the one just after that. It can be computed with nextafter, as standardized e.g. in POSIX.

    Translating these four steps to C should be straightforward (do not forget #pragma STDC FENV_ACCESS ON). Or, as Thomas Weller recommended, brute-force it. A search by dichotomy would take less than 64 steps.

    NOTE: it is possible to compute the smallest value that reciprocates to a finite result by using the FE_UPWARD rounding mode and only three steps, but this relies on the additional property that 1.0L / 1.fffffffffffff8p1023L cannot be an exact operation. The four-step method is conceptually cleaner.