c, floating-point, comparison

Compare floating point numbers as integers


Can two floating point values (IEEE 754 binary64) be compared as integers? E.g.:

long long a = * (long long *) ptr_to_double1,
          b = * (long long *) ptr_to_double2;
if (a < b) {...}

assuming the size of long long and double is the same.


Solution

  • No. Two floating point values (IEEE 754 binary64) cannot simply be compared as integers with if (a < b).

    IEEE 754 binary64

    The order of the values of double is not the same as the order of their integer bit patterns (unless you are on a rare sign-magnitude machine). Think positive vs. negative numbers.
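
    A minimal sketch of the ordering problem, assuming a 64-bit IEEE 754 double, 2's complement integers, and the same endianness for both. The helper names bits_of and raw_compare_agrees are hypothetical, introduced only for illustration:

```c
#include <assert.h>   // static_assert
#include <stdint.h>
#include <string.h>   // memcpy

static_assert(sizeof(double) == sizeof(int64_t), "double must be 64 bits wide");

// Hypothetical helper: copy the bits of a double into a signed 64-bit integer.
// memcpy sidesteps the aliasing and alignment problems of a pointer cast.
static int64_t bits_of(double d) {
  int64_t i;
  memcpy(&i, &d, sizeof i);
  return i;
}

// Hypothetical helper: returns 1 when the raw integer "<" gives the same
// answer as the floating point "<", and 0 when the two disagree.
static int raw_compare_agrees(double a, double b) {
  return (bits_of(a) < bits_of(b)) == (a < b);
}
```

    Under those assumptions, raw_compare_agrees(1.0, 2.0) is 1, while raw_compare_agrees(-2.0, -1.0) is 0: among negative values a larger magnitude gives a larger bit pattern, so the integer order is reversed.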

    double has values like 0.0 and -0.0 which have the same value but different bit patterns.
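
    The zero case can be checked directly. This is a sketch under the same assumptions (64-bit IEEE 754 double); bits_from_double and equal_value_different_bits are hypothetical names:

```c
#include <assert.h>   // static_assert
#include <stdint.h>
#include <string.h>   // memcpy

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

// Hypothetical helper: the bit pattern of a double as an unsigned integer.
static uint64_t bits_from_double(double d) {
  uint64_t u;
  memcpy(&u, &d, sizeof u);
  return u;
}

// Hypothetical helper: returns 1 when a and b compare equal as doubles
// even though their bit patterns differ.
static int equal_value_different_bits(double a, double b) {
  return a == b && bits_from_double(a) != bits_from_double(b);
}
```

    Here equal_value_different_bits(0.0, -0.0) is 1: only the sign bit differs, yet 0.0 == -0.0 holds.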

    double has "Not-a-number"s that do not compare like their binary equivalent integer representation.
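
    To illustrate the NaN case, the sketch below builds a quiet NaN from its bit pattern and compares it with infinity both ways. It assumes a 64-bit IEEE 754 double and a compiler that is not in a fast-math mode; the helper names are hypothetical:

```c
#include <assert.h>   // static_assert
#include <math.h>     // INFINITY, isnan
#include <stdint.h>
#include <string.h>   // memcpy

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

// Hypothetical helpers: move bits between a double and a 64-bit integer.
static double double_from_bits(uint64_t u) {
  double d;
  memcpy(&d, &u, sizeof d);
  return d;
}

static uint64_t bits_from_double(double d) {
  uint64_t u;
  memcpy(&u, &d, sizeof u);
  return u;
}

// Returns 1 when a and b are unordered as doubles: not <, not >, not ==.
static int unordered_as_double(double a, double b) {
  return !(a < b) && !(a > b) && !(a == b);
}
```

    With q = double_from_bits(UINT64_C(0x7FF8000000000000)), a quiet NaN, unordered_as_double(q, INFINITY) is 1, whereas bits_from_double(q) > bits_from_double(INFINITY) is true: the integer view invents an order that the floating point comparison does not have.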

    If both double values were greater than 0 and not "Not-a-number", and endianness, aliasing, alignment, etc. were not an issue, OP's idea would work.

    Alternatively, a more complex comparison would work; see the tested code below.
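
    The restricted case (both values greater than 0, neither a NaN) can be sketched as follows. This is an illustration rather than the tested code; compare_positive is a hypothetical name, and memcpy is used here to avoid the aliasing and alignment issues of the pointer cast:

```c
#include <assert.h>   // static_assert
#include <stdint.h>
#include <string.h>   // memcpy

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

// Hypothetical helper: three-way compare that is only valid when
// a > 0, b > 0 and neither is a NaN. Returns -1, 0 or 1.
static int compare_positive(double a, double b) {
  uint64_t ua, ub;
  memcpy(&ua, &a, sizeof ua);
  memcpy(&ub, &b, sizeof ub);
  // For positive binary64 values, a larger value has a larger bit pattern.
  return (ua > ub) - (ua < ub);
}
```

    For example, compare_positive(1.0, 2.0) returns -1 and compare_positive(3.5, 3.5) returns 0. Passing a negative value or a NaN falls outside what this sketch claims to handle.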

    Non-IEEE 754 binary64

    Some double formats use an encoding with multiple representations of the same value. Comparing those would give a different result from an "integer" compare.


    Tested code: it needs 2's complement integers and the same endianness for double and the integer types, and it does not account for NaN.

    #include <stdint.h>

    // Returns -1, 0 or 1 for a < b, a == b, a > b, using integer compares only.
    int compare(double a, double b) {
      union {
        double d;
        int64_t i64;
        uint64_t u64;
      } ua, ub;
      ua.d = a;
      ub.d = b;
      // Cope with -0.0 right away
      if (ua.u64 == 0x8000000000000000) ua.u64 = 0;
      if (ub.u64 == 0x8000000000000000) ub.u64 = 0;
      // Signs differ?
      if ((ua.i64 < 0) != (ub.i64 < 0)) {
        return ua.i64 >= 0 ? 1 : -1;
      }
      // If numbers are negative, negate so that a larger magnitude
      // becomes a smaller unsigned value
      if (ua.i64 < 0) {
        ua.u64 = -ua.u64;
        ub.u64 = -ub.u64;
      }
      return (ua.u64 > ub.u64) - (ua.u64 < ub.u64);
    }
    

    Thanks to @David C. Rankin for a correction.

    Test code

    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    // Print any pair for which compare() disagrees with the native comparison.
    void testcmp(double a, double b) {
      int t1 = (a > b) - (a < b);
      int t2 = compare(a, b);
      if (t1 != t2) {
        printf("%e %e %d %d\n", a, b, t1, t2);
      }
    }

    void testcmps(void) {
      // Various interesting `double`
      static const double a[] = {
          -INFINITY, -DBL_MAX, -1.0, -DBL_MIN, -0.0,
          +0.0, DBL_MIN, 1.0, DBL_MAX, +INFINITY };

      int n = sizeof a / sizeof a[0];
      for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
          testcmp(a[i], a[j]);
        }
      }
      // Marks the end of the run; no other output means every pair agreed.
      puts("!");
    }
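
    A variant of the same idea, not part of the tested code above: map each double to an unsigned key whose integer order matches the numeric order, then compare the keys. This is a sketch assuming a 64-bit IEEE 754 double with the same endianness as the integer type. Like the tested code, it does not account for NaN. The names order_key and compare_by_key are hypothetical:

```c
#include <assert.h>   // static_assert
#include <stdint.h>
#include <string.h>   // memcpy

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

// Hypothetical helper: an unsigned key that sorts in the same order as the
// double it came from (NaN not handled).
static uint64_t order_key(double d) {
  const uint64_t sign = UINT64_C(0x8000000000000000);
  uint64_t u;
  memcpy(&u, &d, sizeof u);
  if (u == sign) u = 0;  // treat -0.0 as +0.0
  // Negative: flip every bit, so a larger magnitude gives a smaller key.
  // Non-negative: set the sign bit, so the key sorts above all negatives.
  return (u & sign) ? ~u : (u | sign);
}

// Three-way compare on the keys: returns -1, 0 or 1.
static int compare_by_key(double a, double b) {
  uint64_t ka = order_key(a), kb = order_key(b);
  return (ka > kb) - (ka < kb);
}
```

    The key form can be handy when many values are compared repeatedly, e.g. for sorting, since each value is transformed once and then compared with a single unsigned comparison.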