What is the smallest single and double precision float such that its reciprocal is still not equal infinity under IEEE 754?
Edit: I'm asking it because I just want to understand how it works
Let us use IEEE 754 double-precision as an example. We assume that it is mapped to double
in our C compilation platform. C99 hexadecimal notation is convenient so we'll take advantage of it. Let us also assume that long double
has at least one extra bit of precision over double
, for instance, long double
is Intel's 80-bit “extended double“.
The double
operation 1.0 / x
rounds to +inf
if and only if the mathematical result of the division is above the number 1.fffffffffffff8p1023L
. This number is not representable as a double
, but it is exactly the midpoint between DBL_MAX
and what would be the next double
value after DBL_MAX
if the double
exponent had a wider range. This is how IEEE 754 defines whether basic operations such as /
should round to infinity.
Therefore, the highest value double
value to round to +inf
when reciprocated can be computed with the following steps:
FE_DOWNWARD
1.0L / 1.fffffffffffff8p1023L
double
.The smallest value not to round to infinity is the one just after that. It can be computed with nextafter
, as standardized e.g. in POSIX.
Translating these four steps to C should be straightforward (do not forget #pragma STDC FENV_ACCESS ON
). Or, as Thomas Weller recommended, brute-force it. A search by dichotomy would take less than 64 steps.
NOTE: it is possible to compute the smallest value that reciprocates to a finite result by using the FE_UPWARD
rounding mode and only three steps, but this relies on the additional property that 1.0L / 1.fffffffffffff8p1023L
cannot be an exact operation. The four-step method is conceptually cleaner.