Usually nextafter
is implemented in the following way:
double nextafter(double x, double y)
{
// handle corner cases
int delta = ((x > 0) == (x < y)) ? 1 : -1;
unsigned long long mant = __mant(x); // get mantissa as int
mant += delta;
...
}
Here, a binary representation is obtained using __mant(x)
.
Out of curiosity: is it possible to implement nextafter
without obtaining a binary representation? For example, using a sequence of arithmetic floating point operations.
The code below implements nextafter
in the ascending direction for finite values for IEEE-754 arithmetic with round-to-nearest-ties-to-even. Handling of NaNs, infinities, and the descending direction is obvious.
Without assuming IEEE-754 or round-to-nearest, the floating-point properties are sufficiently well characterized by C 2018 5.2.4.2.2 that we we can implement nextafter (again in the ascending direction) this way:
-DBL_MAX
.-DBL_TRUE_MIN
, return zero.+DBL_TRUE_MIN
.+DBL_MAX
, return +∞.nextafter(x, y)
implementation, as it moves the first argument in the direction of the second argument, so we never ascend from +∞ because we never receive a second argument greater than +∞.)logb
to find the exponent e
. If e
is less than DBL_MIN
, return the input plus DBL_TRUE_MIN
(the ULP of subnormals and the lowest normals). If e
is not less than DBL_MIN
, return the input plus scalb(1, e + 1 - DBL_MANT_DIG)
(the specific ULP for the input). Rounding method is irrelevant as these additions are exact.FLT_RADIX
(the input equals scalb(1, e)
), decrement the second argument of scalb
by one (because this nextafter
step transitions from a greater exponent to a lower one).Note that FLT_RADIX
is correct above; there is no DBL_RADIX
; all floating-point formats use the same radix.
If you want to consider logb
and scalb
as functions that manipulate the floating-point representation, then they could be replaced by ordinary arithmetic. log
could find a quick approximation that could be quickly refined to the true exponent, and scalb
can be implemented in a variety of ways, possibly simply a table look-up. If log
remains objectionable, then trial comparisons would suffice.
The above handles formats with or without subnormals because, if subnormals are supported, it steps into them with the decrement, and, if subnormals are not supported, the minimum normal magnitude is DBL_TRUE_MIN
, so it is recognized in the above as the point where we step to zero next.
There is one caveat; the C standard allows it to be “indeterminable” whether an implementation supports subnormals or not “if floating-point operations do not consistently interpret subnormal representations as zero, nor as nonzero.” In that case, I do not see that the standard specifies what the standard nextafter
does, so there is nothing for us to do to match it in our implementation. Supposing that subnormals are sometimes supported, DBL_TRUE_MIN
must be a subnormal value, and the above will attempt to work as if subnormal support is currently on (e.g., flush-to-zero is off) and, if it is off, you will get whatever you get.
#include <float.h>
#include <math.h>
/* Return the next floating-point value after the finite value q.
This was inspired by Algorithm 3.5 in Siegfried M. Rump, Takeshi Ogita, and
Shin'ichi Oishi, "Accurate Floating-Point Summation", _Technical Report
05.12_, Faculty for Information and Communication Sciences, Hamburg
University of Technology, November 13, 2005.
IEEE-754 and the default rounding mode,
round-to-nearest-ties-to-even, may be required.
*/
double NextAfter(double q)
{
/* Scale is .625 ULP, so multiplying it by any significand in [1, 2)
yields something in [.625 ULP, 1.25 ULP].
*/
static const double Scale = 0.625 * DBL_EPSILON;
/* Either of the following may be used, according to preference and
performance characteristics. In either case, use a fused multiply-add
(fma) to add to q a number that is in [.625 ULP, 1.25 ULP]. When this
is rounded to the floating-point format, it must produce the next
number after q.
*/
#if 0
// SmallestPositive is the smallest positive floating-point number.
static const double SmallestPositive = DBL_EPSILON * DBL_MIN;
if (fabs(q) < 2*DBL_MIN)
return q + SmallestPositive;
return fma(fabs(q), Scale, q);
#else
return fma(fmax(fabs(q), DBL_MIN), Scale, q);
#endif
}
#if defined CompileMain
#include <stdio.h>
#include <stdlib.h>
#define NumberOf(a) (sizeof (a) / sizeof *(a))
int main(void)
{
int status = EXIT_SUCCESS;
static const struct { double in, out; } cases[] =
{
{ INFINITY, INFINITY },
{ 0x1.fffffffffffffp1023, INFINITY },
{ 0x1.ffffffffffffep1023, 0x1.fffffffffffffp1023 },
{ 0x1.ffffffffffffdp1023, 0x1.ffffffffffffep1023 },
{ 0x1.ffffffffffffcp1023, 0x1.ffffffffffffdp1023 },
{ 0x1.0000000000003p1023, 0x1.0000000000004p1023 },
{ 0x1.0000000000002p1023, 0x1.0000000000003p1023 },
{ 0x1.0000000000001p1023, 0x1.0000000000002p1023 },
{ 0x1.0000000000000p1023, 0x1.0000000000001p1023 },
{ 0x1.fffffffffffffp1022, 0x1.0000000000000p1023 },
{ 0x1.fffffffffffffp1, 0x1.0000000000000p2 },
{ 0x1.ffffffffffffep1, 0x1.fffffffffffffp1 },
{ 0x1.ffffffffffffdp1, 0x1.ffffffffffffep1 },
{ 0x1.ffffffffffffcp1, 0x1.ffffffffffffdp1 },
{ 0x1.0000000000003p1, 0x1.0000000000004p1 },
{ 0x1.0000000000002p1, 0x1.0000000000003p1 },
{ 0x1.0000000000001p1, 0x1.0000000000002p1 },
{ 0x1.0000000000000p1, 0x1.0000000000001p1 },
{ 0x1.fffffffffffffp-1022, 0x1.0000000000000p-1021 },
{ 0x1.ffffffffffffep-1022, 0x1.fffffffffffffp-1022 },
{ 0x1.ffffffffffffdp-1022, 0x1.ffffffffffffep-1022 },
{ 0x1.ffffffffffffcp-1022, 0x1.ffffffffffffdp-1022 },
{ 0x1.0000000000003p-1022, 0x1.0000000000004p-1022 },
{ 0x1.0000000000002p-1022, 0x1.0000000000003p-1022 },
{ 0x1.0000000000001p-1022, 0x1.0000000000002p-1022 },
{ 0x1.0000000000000p-1022, 0x1.0000000000001p-1022 },
{ 0x0.fffffffffffffp-1022, 0x1.0000000000000p-1022 },
{ 0x0.ffffffffffffep-1022, 0x0.fffffffffffffp-1022 },
{ 0x0.ffffffffffffdp-1022, 0x0.ffffffffffffep-1022 },
{ 0x0.ffffffffffffcp-1022, 0x0.ffffffffffffdp-1022 },
{ 0x0.0000000000003p-1022, 0x0.0000000000004p-1022 },
{ 0x0.0000000000002p-1022, 0x0.0000000000003p-1022 },
{ 0x0.0000000000001p-1022, 0x0.0000000000002p-1022 },
{ 0x0.0000000000000p-1022, 0x0.0000000000001p-1022 },
{ -0x1.fffffffffffffp1023, -0x1.ffffffffffffep1023 },
{ -0x1.ffffffffffffep1023, -0x1.ffffffffffffdp1023 },
{ -0x1.ffffffffffffdp1023, -0x1.ffffffffffffcp1023 },
{ -0x1.0000000000004p1023, -0x1.0000000000003p1023 },
{ -0x1.0000000000003p1023, -0x1.0000000000002p1023 },
{ -0x1.0000000000002p1023, -0x1.0000000000001p1023 },
{ -0x1.0000000000001p1023, -0x1.0000000000000p1023 },
{ -0x1.0000000000000p1023, -0x1.fffffffffffffp1022 },
{ -0x1.0000000000000p2, -0x1.fffffffffffffp1 },
{ -0x1.fffffffffffffp1, -0x1.ffffffffffffep1 },
{ -0x1.ffffffffffffep1, -0x1.ffffffffffffdp1 },
{ -0x1.ffffffffffffdp1, -0x1.ffffffffffffcp1 },
{ -0x1.0000000000004p1, -0x1.0000000000003p1 },
{ -0x1.0000000000003p1, -0x1.0000000000002p1 },
{ -0x1.0000000000002p1, -0x1.0000000000001p1 },
{ -0x1.0000000000001p1, -0x1.0000000000000p1 },
{ -0x1.0000000000000p-1021, -0x1.fffffffffffffp-1022 },
{ -0x1.fffffffffffffp-1022, -0x1.ffffffffffffep-1022 },
{ -0x1.ffffffffffffep-1022, -0x1.ffffffffffffdp-1022 },
{ -0x1.ffffffffffffdp-1022, -0x1.ffffffffffffcp-1022 },
{ -0x1.0000000000004p-1022, -0x1.0000000000003p-1022 },
{ -0x1.0000000000003p-1022, -0x1.0000000000002p-1022 },
{ -0x1.0000000000002p-1022, -0x1.0000000000001p-1022 },
{ -0x1.0000000000001p-1022, -0x1.0000000000000p-1022 },
{ -0x1.0000000000000p-1022, -0x0.fffffffffffffp-1022 },
{ -0x0.fffffffffffffp-1022, -0x0.ffffffffffffep-1022 },
{ -0x0.ffffffffffffep-1022, -0x0.ffffffffffffdp-1022 },
{ -0x0.ffffffffffffdp-1022, -0x0.ffffffffffffcp-1022 },
{ -0x0.0000000000004p-1022, -0x0.0000000000003p-1022 },
{ -0x0.0000000000003p-1022, -0x0.0000000000002p-1022 },
{ -0x0.0000000000002p-1022, -0x0.0000000000001p-1022 },
{ -0x0.0000000000001p-1022, -0x0.0000000000000p-1022 },
};
for (int i = 0; i < NumberOf(cases); ++i)
{
double in = cases[i].in, expected = cases[i].out;
double observed = NextAfter(in);
printf("NextAfter(%a) = %a.\n", in, observed);
if (! (observed == expected))
{
printf("\tError, expected %a.\n", expected);
status = EXIT_FAILURE;
}
}
return status;
}
#endif // defined CompileMain