
Logarithm to integer base of integer


I would like to find the upward-rounded value of the logarithm of an integer n to an integer base b. In code:

result = int(ceil(log(n, b)))

The problem is that the value sometimes cannot be represented exactly in floating point, and the result then gets overestimated. For example:

int(ceil(log(125, 5))) == int(ceil(3.0000000000000004)) == 4

What can I do about this? Subtracting a tiny epsilon would underestimate it elsewhere. Is there a way to sidestep floating-point calculation entirely, kind of like when using base 2?

I could use a loop to find the logarithm, but I was wondering whether it is possible to do this in constant time.


Solution

  • Well, you have to be careful about what you mean by constant time. If you're dealing with fixed-width integers, then a simple for loop is bounded (the worst case for a 64-bit integer in base 2 is 64 multiplications) and may still be faster than casting to a float, calling the log function, and truncating back to an integer (a sketch of the loop follows below).

    If you're using arbitrary-precision integers, then there's basically no such thing as a constant-time algorithm, as almost every operation is at least O(log(n)), including casting to a float.
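
    For example, a minimal sketch of that bounded loop, in exact integer arithmetic with no floating point at all (ceil_log_loop is a hypothetical name):

    def ceil_log_loop(n, b):
        # Smallest integer e with b**e >= n, for n >= 1 and b >= 2;
        # at most log_b(n) iterations (64 for a 64-bit n in base 2).
        e, p = 0, 1
        while p < n:
            p *= b
            e += 1
        return e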

    That said, there are a few other things you can try:

    • You can use a find first set operation to find the base-2 logarithm (Python ≥ 3.1 provides this through int.bit_length()). x86 provides the bsr instruction for this purpose. This is also one of the few operations on arbitrary-precision integers that could be constant time (though that would depend on the implementation, memory storage, etc.).

    • Once you have a base-2 logarithm, you can use it to compute the logarithm to any power-of-2 base by integer division; see the first sketch after the code below.

    • If you're only ever using the same base b, and the inputs are bounded by b^k, you could use a precomputed lookup table of the powers of b combined with binary search, which would be O(log(k) * log(n)) for arbitrary-precision integers (log(k) steps for the search, log(n) for each inequality comparison); see the lookup-table sketch after the code below.

    • Even if this is not the case, you could still try some sort of binary search: keep doubling the exponent by squaring until the power is too big, then binary search from there, as in the last sketch below.

    • Your original idea, combined with a bit of error analysis, can quickly handle most cases, falling back on exact arithmetic for the inexact ones. Python doesn't document error bounds for the 2-argument log (and it's clearly not great, since the example you give should be exact), but nowadays most decent math libraries compute the 1-argument log to within 1 ulp (unit in the last place), and the cast to float and the float division to within 1/2 ulp each, giving a total relative error of 3 ulps (since these errors are all multiplicative), assuming your base is exactly representable as a float (i.e. not a massive integer like 1e30).

    In Python, this would be:

    import math, sys

    def clog(n, b):
        # Ceiling of log_b(n), guarding against float rounding near integers.
        a = math.log(n) / math.log(b)
        r = round(a)
        i = int(r)
        if abs(a - r) <= a * 3 * sys.float_info.epsilon:
            # a is within the error bound of an integer, so the float
            # result can't be trusted: fall back on slow exact arithmetic.
            if n > b**i:
                return i + 1
            else:
                return i
        else:
            # a is safely away from an integer; ceil(a) is reliable.
            return int(math.ceil(a))
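
    As a quick check of the fallback path, clog(125, 5) should return 3 here, where the naive int(ceil(log(125, 5))) gave 4.

    For the power-of-2 bullets above, a minimal sketch built on int.bit_length() (ceil_log2 and ceil_log_pow2 are hypothetical names):

    def ceil_log2(n):
        # For n >= 1, (n - 1).bit_length() is the smallest e with 2**e >= n.
        return (n - 1).bit_length()

    def ceil_log_pow2(n, k):
        # Base 2**k: the smallest i with 2**(k*i) >= n is the integer
        # ceiling of ceil_log2(n) / k, done here with ceiling division.
        return -(-ceil_log2(n) // k)

    For example, ceil_log_pow2(257, 4) gives 3, since 16**2 = 256 < 257 <= 16**3.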
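
    The fixed-base lookup table with binary search might look like the following sketch (make_powers and ceil_log_table are hypothetical names; k is whatever bound your application guarantees):

    import bisect

    def make_powers(b, k):
        # Precompute [b**0, b**1, ..., b**k] once for the fixed base b.
        powers, p = [], 1
        for _ in range(k + 1):
            powers.append(p)
            p *= b
        return powers

    def ceil_log_table(n, powers):
        # Smallest index e with powers[e] >= n; assumes 1 <= n <= powers[-1].
        return bisect.bisect_left(powers, n)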
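
    And a sketch of the squaring-then-binary-search idea (ceil_log_gallop is a hypothetical name); note that each probe recomputes b**mid, which is where the extra factors of log(n) come from:

    def ceil_log_gallop(n, b):
        if n <= 1:
            return 0
        # Gallop: square the running power, doubling the exponent,
        # until b**hi >= n.
        hi, p = 1, b
        while p < n:
            p *= p
            hi *= 2
        # Invariant: b**(hi // 2) < n <= b**hi; binary search the gap.
        lo = hi // 2
        while lo < hi:
            mid = (lo + hi) // 2
            if b**mid >= n:
                hi = mid
            else:
                lo = mid + 1
        return hi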