Specifically, are these cumulative product functions in pandas and numpy implemented in a robust way to handle underflow when multiplying lots of small numbers together? For example, are they using the log-sum-exp trick?
Thanks.
Unfortunately, no. The example in @warren-weckesser's comment shows that they do not handle underflow:
import numpy as np

np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150]).cumprod()
# returns
# array([1.0e-005, 1.0e-035, 1.0e-135, 0.0e+000, 0.0e+000, 0.0e+000])
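The reason is that numpy's default float64 has a smallest positive normal value of 2**-1022, or about 2.225e-308. Below that, results are stored as subnormal numbers with reduced precision, and anything smaller than about 4.9e-324 (2**-1074) is rounded to zero. The fourth product here is 1e-335, so it becomes zero, and every later product stays zero because multiplying by zero cannot recover. The same is true for pandas.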
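If you need the running product to survive underflow, one common workaround is to do the work in log space yourself: take the log of each value and use a cumulative sum instead of a cumulative product. The sketch below is only an illustration, not something numpy or pandas does for you. The helper name `log_cumprod` is made up for this answer, and it assumes every input is strictly positive (zeros and negative values would need separate handling of signs).

```python
import numpy as np

def log_cumprod(values):
    """Return the natural log of the cumulative product of `values`.

    Hypothetical helper for illustration; assumes all values are > 0.
    """
    values = np.asarray(values, dtype=np.float64)
    if np.any(values <= 0):
        raise ValueError("log_cumprod assumes strictly positive inputs")
    # log(a * b * c) == log(a) + log(b) + log(c), so a cumulative sum of
    # logs tracks the cumulative product without leaving float64 range.
    return np.cumsum(np.log(values))

x = np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150])
log_cp = log_cumprod(x)

# Convert to base-10 exponents for readability:
# approximately [-5, -35, -135, -335, -285, -135]
log10_cp = log_cp / np.log(10)
print(log10_cp)

# Only exponentiate entries that fit in float64. The last one does
# (about 1e-135), whereas the plain cumprod above reported 0.0 for it.
print(np.exp(log_cp[-1]))
```

The intermediate value 1e-335 still cannot be represented as a float64, so keep results in log form for as long as you can and call `np.exp` only on entries you know are in range.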