Search code examples
pythonfloating-pointprecision

Why is this result more accurate than another result by equivalent functions


I have the following question, why is myf(x) giving less accurate results than myf2(x). Here is my python code:

from math import e, log

def Q1():
    n = 15
    for i in range(1, n):
        #print(myf(10**(-i)))
        #print(myf2(10**(-i)))
    return

def myf(x):
    return ((e**x - 1)/x)

def myf2(x):
    return ((e**x - 1)/(log(e**x)))

Here is the output for myf(x):

1.0517091807564771
1.005016708416795
1.0005001667083846
1.000050001667141
1.000005000006965
1.0000004999621837
1.0000000494336803
0.999999993922529
1.000000082740371
1.000000082740371
1.000000082740371
1.000088900582341
0.9992007221626409
0.9992007221626409

myf2(x):

1.0517091807564762
1.0050167084168058
1.0005001667083415
1.0000500016667082
1.0000050000166667
1.0000005000001666
1.0000000500000017
1.000000005
1.0000000005
1.00000000005
1.000000000005
1.0000000000005
1.00000000000005
1.000000000000005

I believe it has something to do with the floating point number system in python in combination with my machine. The natural log of euler's number produces a number with more digits of precision than its equivalent number x as an integer.


Solution

  • Let's start with the difference between x and log(exp(x)), because the rest of the computation is the same.

    >>> for i in range(10):
    ...     x = 10**-i
    ...     y = exp(x)
    ...     print(x, log(y))
    ... 
    1 1.0
    0.1 0.10000000000000007
    0.01 0.009999999999999893
    0.001 0.001000000000000043
    0.0001 0.00010000000000004326
    9.999999999999999e-06 9.999999999902983e-06
    1e-06 9.99999999962017e-07
    1e-07 9.999999994336786e-08
    1e-08 9.999999889225291e-09
    1e-09 1.000000082240371e-09
    

    If you look closely, you might notice that there are errors creeping in. When 𝑖 = 0, there's no error. When 𝑖 = 1, it prints 0.1 for π‘₯ and 0.10000000000000007 for log(y), which is wrong only after 16 digits. By the time 𝑖 = 9, log(y) is wrong in half the digits from log(𝑒π‘₯).

    Since we know what the true answer is (π‘₯), we can easily compute what the relative error of the approximation is:

    >>> for i in range(10):
    ...     x = 10**-i
    ...     y = exp(x)
    ...     z = log(y)
    ...     print(i, abs((x - z)/z))
    ... 
    0 0.0
    1 6.938893903907223e-16
    2 1.0755285551056319e-14
    3 4.293440603042413e-14
    4 4.325966668215291e-13
    5 9.701576564765975e-12
    6 3.798286318045685e-11
    7 5.663213319457187e-10
    8 1.1077471033430869e-08
    9 8.224036409872509e-08
    

    Each step loses us about a digit of accuracy! Why?

    Each of the operations 10**-i, exp(x), and log(y) only introduces a tiny relative error into the result, less than 10βˆ’15.

    Suppose exp(x) introduces a relative error of 𝛿, returning the number 𝑒π‘₯β‹…(1 + 𝛿) instead of 𝑒π‘₯ (which, after all, is a transcendental number that can't be represented by a finite string of digits). We know that |𝛿| < 10βˆ’15, but what happens when we then try to compute log(𝑒π‘₯β‹…(1 + 𝛿)) as an approximation to log(𝑒π‘₯) = π‘₯?

    We might hope to get π‘₯β‹…(1 + πœ€) where πœ€ is very small. But log(𝑒π‘₯β‹…(1 + 𝛿)) = log(𝑒π‘₯) + log(1 + 𝛿) = π‘₯ + log(1 + 𝛿) = π‘₯β‹…(1 + log(1 + 𝛿)/π‘₯), so πœ€ = log(1 + 𝛿)/π‘₯. And even if 𝛿 is small, π‘₯ β‰ˆ 10βˆ’π‘– gets closer and closer to zero as 𝑖 increases, so the error log(1 + 𝛿)/π‘₯ β‰ˆ 𝛿/π‘₯ can get worse and worse as 𝑖 increases because 1/π‘₯ β†’ ∞.

    We say that the logarithm function is ill-conditioned near 1: if you evaluate it at an approximation to an input near 1, it can turn a very small input error into an arbitrarily large output error. In fact, you can only go a few more steps before exp(x) is rounded to 1 and so log(y) returns zero exactly.

    This is not because of anything in particular about floating-pointβ€”any kind of approximation would have the same effect with log! The condition number of a function is a property of the mathematical function itself, not of the floating-point arithmetic system. If the inputs came from physical measurements, you could run into the same problem.


    This is related to why the functions expm1 and log1p exist. Although the function log(𝑦) is ill-conditioned near 1, the function log(1 + 𝑦) is not, so log1p(y) computes it more accurately than evaluating it log(1 + y) can. Similarly, the subtraction in exp(x) - 1 is subject to catastrophic cancellation when 𝑒π‘₯ β‰ˆ 1, so expm1(x) computes 𝑒π‘₯ βˆ’ 1 more accurately than evaluating exp(x) - 1 can.

    expm1 and log1p are not the same functions as exp and log, of course, but sometimes you can rewrite subexpressions in terms of them to avoid ill-conditioned domains. In this case, for example, if you rewrite log(𝑒π‘₯) as log(1 + [𝑒π‘₯ βˆ’ 1]), and use expm1 and log1p to compute it, the round-trip is often computed exactly:

    >>> for i in range(10):
    ...     x = 10**-i
    ...     y = expm1(x)
    ...     z = log1p(y)
    ...     print(i, x, z, abs((x - z)/z))
    ... 
    0 1 1.0 0.0
    1 0.1 0.1 0.0
    2 0.01 0.01 0.0
    3 0.001 0.001 0.0
    4 0.0001 0.0001 0.0
    5 9.999999999999999e-06 9.999999999999999e-06 0.0
    6 1e-06 1e-06 0.0
    7 1e-07 1e-07 0.0
    8 1e-08 1e-08 0.0
    9 1e-09 1e-09 0.0
    

    For similar reasons, you might want to rewrite (exp(x) - 1)/x as expm1(x)/x. If you don't, then when exp(x) returns 𝑒π‘₯β‹…(1 + 𝛿) rather than 𝑒π‘₯, you will end up with (𝑒π‘₯β‹…(1 + 𝛿) βˆ’ 1)/π‘₯ = (𝑒π‘₯ βˆ’ 1 + 𝛿𝑒π‘₯)/π‘₯ = (𝑒π‘₯ βˆ’ 1)β‹…[1 + 𝛿𝑒π‘₯/(𝑒π‘₯ βˆ’ 1)]/π‘₯, which again may blow up because the error is 𝛿𝑒π‘₯/(𝑒π‘₯ βˆ’ 1) β‰ˆ 𝛿/π‘₯.


    However, it is not simply luck that the second definition seems to produce the correct results! This happens because the compounded errors—𝛿/π‘₯ from exp(x) - 1 in the numerator, 𝛿/π‘₯ from log(exp(x)) in the denominatorβ€”cancel each other out. The first definition computes the numerator badly and the denominator accurately, but the second definition computes them both about equally badly!

    In particular, when π‘₯ β‰ˆ 0, we have

    log(𝑒π‘₯β‹…(1 + 𝛿)) = π‘₯ + log(1 + 𝛿) β‰ˆ π‘₯ + 𝛿

    and

    𝑒π‘₯β‹…(1 + 𝛿) βˆ’ 1 = 𝑒π‘₯ + 𝛿𝑒π‘₯ βˆ’ 1 β‰ˆ 1 + π‘₯ + 𝛿 βˆ’ 1 = π‘₯ + 𝛿.

    Note that 𝛿 is the same in both cases, because it's the error of using exp(x) to approximate 𝑒π‘₯.

    You can test this experimentally by comparing against expm1(x)/x (which is an expression guaranteed to have low relative error, since division never makes errors much worse):

    >>> for i in range(10):
    ...     x = 10**-i
    ...     u = (exp(x) - 1)/log(exp(x))
    ...     v = expm1(x)/x
    ...     print(u, v, abs((u - v)/v))
    ... 
    1.718281828459045 1.718281828459045 0.0
    1.0517091807564762 1.0517091807564762 0.0
    1.0050167084168058 1.0050167084168058 0.0
    1.0005001667083415 1.0005001667083417 2.2193360112628554e-16
    1.0000500016667082 1.0000500016667084 2.220335028798222e-16
    1.0000050000166667 1.0000050000166667 0.0
    1.0000005000001666 1.0000005000001666 0.0
    1.0000000500000017 1.0000000500000017 0.0
    1.000000005 1.0000000050000002 2.2204460381480824e-16
    1.0000000005 1.0000000005 0.0
    

    This approximation π‘₯ + 𝛿 to the numerator and the denominator is best for π‘₯ closest to zero and worst for π‘₯ farthest from zeroβ€”but as π‘₯ gets farther from zero, the error magnification in exp(x) - 1 (from catastrophic cancellation) and in log(exp(x)) (from ill-conditioning of logarithm near 1) both diminish anyway, so the answer remains accurate.


    However, the benign cancellation in the second definition only works until π‘₯ is so close to zero that exp(x) is simply rounded to 1β€”at that point, exp(x) - 1 and log(exp(x)) both give zero and will end up trying to compute 0/0 which yields NaN and a floating-point exception.

    So you should use expm1(x)/x in practice, but it is a happy accident that outside the edge case where exp(x) is 1, two wrongs make a right in (exp(x) - 1)/log(exp(x)) giving good accuracy even as (exp(x) - 1)/x and (for similar reasons) expm1(x)/log(exp(x)) both give bad accuracy.