Search code examples
floating-pointbinaryinteger64-bitbit

If A<B == true if A and B are Doubles. It I copy to the Bits to two uint64_t will A<B still be true


I'm doing some stuff with some crypto libraries and Zero Knowledge Proofs and the libraries I'm using only support Integers.

If I have two numbers stored as Doubles Precision Floats, and I copy the bits for each number into an Integer and I then compare the Integer will.

A_Double>B_Double==A_UINT>B_UINT

for all values.

My guess is for all positive values it might work, but not for negative values.

Although it might work if I flip the first bit.


Solution

  • On a close track

    With a lot of assumptions about the floating point encoding:

    When the integer version of a < 0 flip the value, folding at 0footnote so +0.0 and -0.0 compare as equal. FP values are often encoded like sign-magnitude.

    Also look for Nan encodings.


    A C solution

    #include <stdint.h>
    #include <stdlib.h>
    
    _Static_assert(sizeof(int64) == sizeof(double), "Unexpected sizes");
    
    #define INT_NAN_TEST(a) (llabs(a) > 0x7FF0000000000000u)
    
    /*
     * Compare 2 integers as is they overlay a same size, same endian 64-bit `double`.
     * Return -1,0,1 when they compare a > b, a===b, a< b
     * or `'?` for un-comparable as at least one of `a,b` is NaN
     *
     * This code assumes a lot about the representation of a `double`.
     * Same size
     * Same endian
     * FP with leading sign bit, then a biased exponent, then significand
     */
    int cmp_double(int64_t a, int64_t b) {
      if (a < 0) {
        a = 0x8000000000000000 - a;
      }
      if (b < 0) {
        b = 0x8000000000000000 - b;
      }
      if (INT_NAN_TEST(a) || INT_NAN_TEST(b))
        return '?';
      return (a > b) - (a < b);
    }
    

    Test

    #include <stdlib.h>
    #include <stdio.h>
    #include <float.h>
    #include <limits.h>
    #include <string.h>
    #include <math.h>
    
    int main() {
      const double d[] = {0.0, DBL_TRUE_MIN, DBL_MIN, 1, DBL_MAX, INFINITY, -0.0,
          -DBL_TRUE_MIN, -DBL_MIN, -1, -DBL_MAX, -INFINITY, NAN};
      size_t n = sizeof d / sizeof *d;
      for (size_t i = 0; i < n; i++) {
        double a = d[i];
        int64_t ai;
        memcpy(&ai, &a, sizeof ai);
        for (size_t bb = 0; bb < n; bb++) {
          double b = d[bb];
          int64_t bi;
          memcpy(&bi, &b, sizeof bi);
          int cmp_d = (isnan(a) || isnan(b)) ? '?' : (a > b) - (a < b);
          int cmp_i = cmp_double(ai, bi);
          if (cmp_d != cmp_i) {
            printf("% 20g %16llX % 20g %16llX: %2d %2d\n", a, 1llu * ai, b,
                1llu * bi, cmp_d, cmp_i);
          }
        }
      }
      puts("Done");
    }
    

    footnote By folding negative numbers so 0000_0000_0000_0000 and 8000_0000_0000_0000 are both "zero", the most negative "double as an integer" value will be 1 more than the most negative encodable integer, thus preventing UB in the llabs() calculation.