Tags: python, pandas, numpy, floating-point, underflow

Python: Do pandas.DataFrame.cumprod() and numpy.cumprod() handle numerical underflow?


Specifically, are these cumulative product functions in pandas and numpy implemented in a robust way to handle underflow when multiplying lots of small numbers together? For example, are they using the log-sum-exp trick?

Thanks.


Solution

  • Unfortunately, no. @warren-weckesser's comment shows that they do not:

    import numpy as np

    np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150]).cumprod()
    
    # returns
    array([1.0e-005, 1.0e-035, 1.0e-135, 0.0e+000, 0.0e+000, 0.0e+000])
    

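    The reason is that numpy's default float type (float64) has a smallest positive normal value of 2**-1022, or about 2.225e-308. Below that, results become subnormal and lose precision, and anything smaller than the smallest subnormal, 2**-1074 (about 4.94e-324), is rounded to zero. Here the fourth running product is 1e-135 * 1e-200 = 1e-335, which is far below that limit, so it becomes zero. Every later product is then zero times something, which is why the large factors 1e50 and 1e150 cannot bring the result back. The same is true for pandas, which stores floating-point columns as numpy float64 arrays by default.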
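
If the inputs are strictly positive, one common workaround is to keep the running product in log space: sum the logarithms with `cumsum` instead of multiplying. The sketch below is only an illustration of that idea. `log_cumprod` is a hypothetical helper name, not a numpy or pandas function, and it assumes no zeros or negative values in the input.

```python
import numpy as np

def log_cumprod(values):
    """Natural log of the cumulative product (hypothetical helper, positive inputs only)."""
    values = np.asarray(values, dtype=np.float64)
    if np.any(values <= 0):
        raise ValueError("this sketch only handles strictly positive inputs")
    # log(a * b * c) == log(a) + log(b) + log(c), so a running sum of logs
    # tracks the running product without ever forming the tiny product itself.
    return np.cumsum(np.log(values))

x = np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150])

direct = x.cumprod()                      # reaches zero at the fourth entry and stays there
log_running = log_cumprod(x)              # stays finite for every entry
log10_running = log_running / np.log(10)  # same values expressed as base-10 exponents
recovered = np.exp(log_running)           # nonzero only where float64 can represent the value
```

Converting back with `np.exp` still gives zero for the fourth entry, because 1e-335 is not representable in float64, but the later entries (about 1e-285 and 1e-135) come back, whereas the direct `cumprod` leaves them at zero. If the downstream computation can work with the logs directly, there is no need to convert back at all.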