Tags: python, math, floating-point, rounding-error

When does using floating point operations to calculate ceiling-integer-log2 fail?


I'm curious what the first input is which differentiates these two functions:

from math import *

def ilog2_ceil_alt(i: int) -> int:
    # DO NOT USE THIS FUNCTION IT'S WRONG
    return ceil(log2(i))

def ilog2_ceil(i: int) -> int:
    # Correct
    if i <= 0:
        raise ValueError("math domain error")
    return (i-1).bit_length()

...

Obviously, the first one is going to fail for some inputs due to rounding/truncation errors when cramming (the log of) an unlimited-size integer through a finite double in the pipeline. However, I tried running this test code for a few minutes, and it didn't find a problem:

...

def test(i):
    if ilog2_ceil(i) != ilog2_ceil_alt(i):
        return i

def main(start=1):
    import multiprocessing, itertools
    p = multiprocessing.Pool()
    it = p.imap_unordered(test, itertools.count(start), 100)
    return next(i for i in it if i is not None)

if __name__ == '__main__':
    i = main()
    print("Failing case:", i)

I tried testing assorted large values like 2**32 and 2**64 + 9999, but it didn't fail on these.

What is the smallest (positive) integer for which the "alt" function fails?


Solution

  • A first issue with ceil(log2(i)) is that the integer i will first be converted to the floating-point type, here with a 53-bit precision, and this conversion can round the value downward (the wrong direction for a ceiling!). For instance, if i is 2^53 + 1, it will be converted to 2^53, and you'll get 53 instead of 54.
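
    A quick check in CPython illustrates this (a minimal sketch: it assumes that math.log2 converts an int argument that fits in a double before computing the logarithm, which is what CPython does for a value of this size):

    from math import ceil, log2

    i = 2**53 + 1
    print(float(i) == 2**53)       # True: the +1 is lost in the conversion
    print(ceil(log2(i)))           # 53 -- wrong, the exact result is 54
    print((i - 1).bit_length())    # 54 -- correct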

    But another problem may occur even with smaller values: The values to consider are those that are slightly larger than a power of 2, say 2^n + k with a small integer k > 0, because the log2 may round to the integer n (even if log2 is correctly rounded) while you would want a FP number slightly larger than n at this point. Thus this will give ceil(n), i.e. n instead of n+1.

    Now, let's consider the worst case for k, i.e. k = 1. Let p denote the precision (in your example, p = 53 for double precision, but let's generalize). One has log2(2^n + 1) = n + log2(1 + 2^(−n)) ≈ n + 2^(−n)/ln 2 ≈ n + 1.44·2^(−n). If the representation of n needs exactly q bits, then the ulp of a value near n will be 2^(q−p). To get the expected result, 1.44·2^(−n) needs to be larger than 1/2 ulp; since 1.44·2^(−n) > 2^(−n−2), it suffices that 2^(−n−2) ⩾ 2^(q−p−1), i.e. n ⩽ p−q−1.

    Since q = ⌈log2(n)⌉, the condition is n + ⌈log2(n)⌉ ⩽ p − 1.

    But since ⌈log2(n)⌉ is small compared to n, the maximum value of n will be of the order of p, so that one gets an approximate condition by replacing ⌈log2(n)⌉ by ⌈log2(p)⌉, i.e. n ⩽ p − ⌈log2(p)⌉ − 1.
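
    To make this concrete for double precision (p = 53), here is a small Python check of the half-ulp comparison above, for n = 48 and n = 49 (it uses math.ulp, available since Python 3.9, and the coefficient 1/ln 2 ≈ 1.44 from above). It is only a sketch of the criterion; the exhaustive search over each precision is done by the MPFR program below.

    from math import log, ulp

    for n in (48, 49):
        delta = 2.0**-n / log(2)       # log2(2^n + 1) - n, approximately
        half_ulp = ulp(float(n)) / 2   # half ulp of a double near n
        # True  -> a correctly rounded log2 lands strictly above n (ceil is n + 1, correct)
        # False -> it rounds to exactly n (ceil is n, failure)
        print(n, delta > half_ulp)     # prints: 48 True, then 49 False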

    Here's a C program using GNU MPFR to find the first value of n that fails, for each precision p:

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpfr.h>
    
    static void test (long p)
    {
      mpfr_t x;
    
      mpfr_init2 (x, p);
      for (long n = 1; ; n++)
        {
          /* i = 2^n + 1 */
          mpfr_set_ui_2exp (x, 1, n, MPFR_RNDN);
          mpfr_add_ui (x, x, 1, MPFR_RNDN);
          mpfr_log2 (x, x, MPFR_RNDN);
          mpfr_ceil (x, x);
          long r = mpfr_get_si (x, MPFR_RNDN);
          if (r != n + 1)
            {
              printf ("p = %ld, fail for n = %ld\n", p, n);
              break;
            }
        }
      mpfr_clear (x);
    }
    
    int main (int argc, char **argv)
    {
      long p, pmax;
    
      if (argc != 2)
        exit (1);
      pmax = strtol (argv[1], NULL, 0);
      for (p = 2; p <= pmax; p++)
        test (p);
    
      return 0;
    }
    

    With the argument 64, one gets:

    p = 2, fail for n = 2
    p = 3, fail for n = 3
    p = 4, fail for n = 4
    p = 5, fail for n = 4
    p = 6, fail for n = 5
    p = 7, fail for n = 6
    p = 8, fail for n = 7
    p = 9, fail for n = 8
    p = 10, fail for n = 8
    p = 11, fail for n = 9
    p = 12, fail for n = 10
    p = 13, fail for n = 11
    p = 14, fail for n = 12
    p = 15, fail for n = 13
    p = 16, fail for n = 14
    p = 17, fail for n = 15
    p = 18, fail for n = 16
    p = 19, fail for n = 16
    p = 20, fail for n = 17
    p = 21, fail for n = 18
    p = 22, fail for n = 19
    p = 23, fail for n = 20
    p = 24, fail for n = 21
    p = 25, fail for n = 22
    p = 26, fail for n = 23
    p = 27, fail for n = 24
    p = 28, fail for n = 25
    p = 29, fail for n = 26
    p = 30, fail for n = 27
    p = 31, fail for n = 28
    p = 32, fail for n = 29
    p = 33, fail for n = 30
    p = 34, fail for n = 31
    p = 35, fail for n = 32
    p = 36, fail for n = 32
    p = 37, fail for n = 33
    p = 38, fail for n = 34
    p = 39, fail for n = 35
    p = 40, fail for n = 36
    p = 41, fail for n = 37
    p = 42, fail for n = 38
    p = 43, fail for n = 39
    p = 44, fail for n = 40
    p = 45, fail for n = 41
    p = 46, fail for n = 42
    p = 47, fail for n = 43
    p = 48, fail for n = 44
    p = 49, fail for n = 45
    p = 50, fail for n = 46
    p = 51, fail for n = 47
    p = 52, fail for n = 48
    p = 53, fail for n = 49
    p = 54, fail for n = 50
    p = 55, fail for n = 51
    p = 56, fail for n = 52
    p = 57, fail for n = 53
    p = 58, fail for n = 54
    p = 59, fail for n = 55
    p = 60, fail for n = 56
    p = 61, fail for n = 57
    p = 62, fail for n = 58
    p = 63, fail for n = 59
    p = 64, fail for n = 60
    

    EDIT: So, for double precision (p = 53), if log2 is correctly rounded (or at least, has a good accuracy), the smallest failing integer is 2^49 + 1.
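
    Back in Python, with the two functions from the question, this can be checked directly; the first failing value may differ slightly on a platform whose log2 is less accurate:

    from math import ceil, log2

    def ilog2_ceil_alt(i: int) -> int:
        return ceil(log2(i))             # floating-point version (wrong)

    def ilog2_ceil(i: int) -> int:
        return (i - 1).bit_length()      # exact integer version (for i > 0)

    i = 2**49 + 1
    print(ilog2_ceil_alt(i), ilog2_ceil(i))   # 49 50 -> the alt version fails here

    # Scan the worst case i = 2^n + 1 for the first failing n:
    for n in range(1, 60):
        if ilog2_ceil_alt(2**n + 1) != n + 1:
            print("first failing n:", n)      # 49, assuming an accurate log2
            break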