Search code examples
numpymatlabprecisionnumericlogarithm

Mismatch between Matlab log and Numpy np.log


While rewriting an old Matlab code to NumPy, I noticed differences in logarithmic calculation. In NumPy, I use np.log, Matlab uses log function.

b = [1 1 2 3 5 1 1];
p = b ./ sum(b);
sprintf('log(%.20f) = %.20f', p(5), log(p(5)))
import numpy as np
b = np.array([1, 1, 2, 3, 5, 1, 1])
p = b.astype('float64') / np.sum(b)
print(f'log({p[4]:.20f}) = {np.log(p[4]):.20f}')

For my MacBook Pro 2020 with M1 chip, I get mismatch at 16th decimal digit.

log(0.35714285714285715079) = -1.02961941718115834732  # Matlab
log(0.35714285714285715079) = -1.02961941718115812527  # NumPy

I would like to get exactly the same results. Any idea, how to modify my Python code?


Solution

  • Both MATLAB and numpy by default use 64bit float that have a 52 bit mantissa. This means the smallest relative step between two float64 numbers is 2**-52 = 2.2e-16. This means any decimal after the 16th has no significance. The difference you're seeing is probably due to a sligtly different implementation. You can check this by using

    np.nextafter(a, 1)-a
    

    For a = np.log(0.35714285714285715079) you get 2.2e-16 which is roughly the size of the machine precision np.finfo(np.float64).eps.

    Even if you look at the input: You're providing more decimals than necessary to completely define a 64-bit float. We can set the number of displaye decimals to 100 and it will still just print 17 digits for this reason:

    >>> np.set_printoptions(precision=100)
    >>> np.array([0.35714285714285715079])
    array([0.35714285714285715])                       
    

    The difference between MATLAB and numpy might even be caused by reordering a sum, as floating point addition is not associative. If you do depend on the 16th decimal place then you should use something other than 64bit float. I'd recommend familiarizing yourself with how floating point types are implemented, as it is vital when working with scientific software. And if you'd like, I'd recommend taking a look at the source code of numpy, to see how it is implemented, and compare it to other open libraries.