
Finding next IEEE 754 representable number (towards -INF) with C?


I am trying to write a function that takes a 32-bit floating-point number (converted from a 32-bit binary string) and returns the previous representable float as 32-bit binary. So far I have the conversion from binary to float down, but I am having trouble understanding how to find the next representable IEEE 754 value. Can't you just subtract the smallest representable value possible (000 0000 0000 0000 0000 0001)? Also, what (if any) are the benefits of converting from IEEE 754 to float before finding the closest representable binary value?

So far I only have a function that converts a floating-point number to single-precision 32-bit binary. I would include my code, but this is for school, so I feel iffy about putting it online and getting explicit corrections and advice.


Solution

  • Q: Can't you just subtract the smallest representable value possible?
    A: No. Floating-point numbers are distributed logarithmically, not linearly. Subtracting any fixed value such as 0.000001 would have no effect on large floats and an overly large effect on tiny ones.
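
To make the point above concrete, here is a minimal sketch, assuming float is IEEE 754 binary32. The helper name step_is_lost is made up for this illustration and is not part of the answer's code.

```c
#include <stdbool.h>

/* Hypothetical helper for illustration only.
   Returns true when subtracting `step` from `x` rounds straight back to `x`,
   i.e. the subtraction has no visible effect. */
bool step_is_lost(float x, float step) {
    /* volatile forces the difference to be rounded to float precision,
       even on targets that evaluate float expressions in a wider type */
    volatile float result = x - step;
    return result == x;
}
```

With x = 1.0e8f and step = 1.0e-6f the step is lost, because neighbouring floats near 1e8 are 8 apart. With x = 1.0f the same step does change the value, and with x = 1.0e-10f it overshoots zero entirely and the result lands near -1.0e-6.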

    Q: what ... are the benefits of converting from IEEE 754 to Float before finding the closest representable binary value?
    A: "IEEE 754" and "float" are typically the same type, so no conversion occurs. Both are 32-bit numeric representations.
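
Since the two are the same 32 bits, getting from a binary string to a float is mostly a matter of reinterpreting the bits. A rough sketch, assuming float is IEEE 754 binary32 and the string holds exactly 32 '0'/'1' characters with no spaces; all three function names are hypothetical and chosen for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a string of '0'/'1' digits into a raw bit pattern.
   No validation is done here; a real version should check length and digits. */
uint32_t bits_from_binary_string(const char *s) {
    return (uint32_t)strtoul(s, NULL, 2);   /* base 2 */
}

/* Reinterpret 32 bits as a float. The bytes are copied, not converted. */
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* The reverse direction: expose the bit pattern of a float. */
uint32_t bits_from_float(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For instance, the string "00111111100000000000000000000000" is the bit pattern 0x3F800000, which reinterprets as 1.0f.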

    The following depends on float being IEEE 754 binary32. It also depends on the endianness of int32_t and float matching. It returns NaN when the input is -INF.

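    #include <stdint.h>

    float nextdown(float x) {
      // view the same 32 bits as either a float or a signed integer
      union {
        float x;
        int32_t i;
      } u;
      u.x = x;
      if (u.i > 0) {
        u.i--;              // positive: shrink the magnitude
      }
      else if (u.i < 0) {
        u.i++;              // negative, including -0.0: grow the magnitude
      }
      else {
        u.i = 0x80000001;   // +0.0: go to the negative value nearest zero
      }
      return u.x;
    }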
    

    The above does not handle NaN well. A simple extra test fixes that:

    float nextdown(float x) {
      // catch NaN
      if (x != x) return x;
    
      union {
        float x;
        int32_t i;
      } u;
      ...
    

    Note: OP's desired function is nearly the same as nextafterf(x, -1.0f/0.0f) from <math.h>, which was used to test this code. The two differ for NaN and -INF inputs.
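
For reference, here is one way the pieces discussed above might fit together. It is a sketch under stated assumptions rather than the answer's own code: it uses memcpy instead of the union, checks the binary32 assumption at compile time (C11 _Static_assert), and returns NaN and -INF unchanged. The name nextdown_checked is hypothetical. With those two guards it should agree with nextafterf(x, -INFINITY), though that is an expectation rather than something established above.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Compile-time checks for the binary32 assumption. They do not check that
   float and uint32_t share the same byte order; that is still assumed. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");
_Static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128,
               "float must have binary32 precision and range");

/* Hypothetical name; returns the next representable float toward -INF. */
float nextdown_checked(float x) {
    if (x != x) return x;           /* NaN: pass it through */
    if (x < -FLT_MAX) return x;     /* -INF: nothing lies below it */

    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    if ((bits & 0x7FFFFFFFu) == 0) {
        bits = 0x80000001u;         /* +0.0 or -0.0: smallest negative subnormal */
    } else if (bits & 0x80000000u) {
        bits++;                     /* negative: grow the magnitude */
    } else {
        bits--;                     /* positive: shrink the magnitude */
    }

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Working on an unsigned bit pattern keeps the sign test explicit and avoids relying on how 0x80000001 converts to a signed type.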