
Numerically stable evaluation of log of integral of a function with extremely small values


If I have a random variable Z defined as the sum of two other random variables, X and Y, then the probability distribution of Z is the convolution of the probability distributions of X and Y. Convolution amounts to an integral of the product of the density functions. Often there is no analytic solution to this integral, so it must be computed with a quadrature algorithm. In pseudocode:

prob_z(z) = integrate(lambda t: prob_x(t) * prob_y(z-t), -inf, inf)

For a concrete example, the sum Z of a normally distributed variable X and a log normally distributed variable Y can be calculated with the following Python/Scipy code:

from scipy.integrate import quad
from scipy.stats import norm, lognorm
from numpy import log, inf

prob_x = lambda x: norm.pdf(x, 0, 1)  # N(mu=0, sigma=1)
prob_y = lambda y: lognorm.pdf(y, 0.1, scale=10)  # LogN(mu=log(10), sigma=0.1)
def prob_z(z):
    return quad(lambda t: prob_x(t)*prob_y(z-t), -inf, inf)[0]
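As a quick sanity check (my own illustration, not part of the question), this integral behaves fine for moderate arguments but underflows to exactly 0.0 far out in the tail, which is what breaks the log:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import lognorm, norm

prob_x = lambda x: norm.pdf(x, 0, 1)              # N(mu=0, sigma=1)
prob_y = lambda y: lognorm.pdf(y, 0.1, scale=10)  # LogN(mu=log(10), sigma=0.1)
prob_z = lambda z: quad(lambda t: prob_x(t)*prob_y(z-t), -np.inf, np.inf)[0]

print(prob_z(10))    # a moderate density, comfortably inside double range
print(prob_z(2000))  # every sample of the integrand underflows: exactly 0.0
```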

Now I want to calculate the log probability. The naive solution is to simply do:

def log_prob_z(z):
    return log(prob_z(z))

However, this is numerically unstable. Beyond about 39 standard deviations, probability densities are numerically 0.0, so even though the log probability has some finite value there, it can't be obtained by simply taking the log of the probability. Compare norm.pdf(39, 0, 1), which is 0.0, to norm.logpdf(39, 0, 1), which is about -761. Clearly, SciPy doesn't compute logpdf as log(pdf) (it finds some other way), because otherwise it would return -inf, an inferior answer. In the same way, I want to find another way for my problem.
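The contrast is easy to reproduce (a minimal check, assuming SciPy is installed):

```python
from scipy.stats import norm

# The pdf underflows to exactly 0.0 this far into the tail...
print(norm.pdf(39, 0, 1))     # 0.0
# ...but logpdf is evaluated analytically and stays finite.
print(norm.logpdf(39, 0, 1))  # about -761.4
```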

(You may wonder why I care about the log likelihood of values so far from the mean. The answer is parameter fitting. A fitting algorithm can make progress when the log likelihood is some hugely negative number, but nothing can be done when it is -inf or nan.)

The question is: does anyone know how I can rearrange log(quad(...)) so that I avoid computing quad(...) itself, and thereby avoid producing a 0.0 inside the log?


Solution

  • The problem is that the values of the function you are integrating are too small to be represented in double precision, which covers normalized values only down to about 1e-308 (and subnormals down to about 5e-324).

    mpmath to the rescue

    When double precision is not enough for numerical computations, mpmath, a library for arbitrary-precision floating-point arithmetic, is called for. It has its own quad routine, but you'll need to implement your pdf functions so they operate on mpmath numbers (otherwise there won't be enough precision left to integrate). There are many built-in functions, including the normal pdf, so I'm going to use that for illustration.

    Here I convolve two standard normal pdfs, evaluated at z = 70, with SciPy:

    z = 70
    p = quad(lambda t: norm.pdf(t, 0, 1)*norm.pdf(z-t, 0, 1), -np.inf, np.inf)[0]
    

    Sadly, p is exactly 0.0.

    And here I do the same with mpmath, after import mpmath as mp:

    z = 70
    p = mp.quad(lambda t: mp.npdf(t, 0, 1)*mp.npdf(z-t, 0, 1), [-mp.inf, mp.inf])
    

    Now p is an mpmath object that prints as 2.95304756048889e-543, well beyond double-precision scale. And its logarithm, mp.log(p), is -1249.22086778731.
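A side note of my own (not from the original answer): mp.quad places its nodes based on the interval endpoints, so when the integrand's mass sits in a narrow bump far from the origin, it can help to pass the peak location t = z/2 as an explicit split point; the working precision can also be raised via mp.mp.dps if even smaller values are expected:

```python
import mpmath as mp

mp.mp.dps = 30  # raise working precision from the default 15 digits

z = 70
f = lambda t: mp.npdf(t, 0, 1) * mp.npdf(z - t, 0, 1)

# Splitting at the peak (t = z/2) tells the quadrature where the mass is.
p = mp.quad(f, [-mp.inf, z/2, mp.inf])
print(mp.log(p))
```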

    SciPy-based alternative: logarithmic offset

    If for some reason you can't use mpmath, you can at least try to "normalize" the function by moving its values into the double precision range. Here is an example:

    z = 70
    offset = 2*norm.logpdf(z/2, 0, 1)
    logp = offset + np.log(quad(
        lambda t: np.exp(norm.logpdf(t, 0, 1) + norm.logpdf(z-t, 0, 1) - offset),
        -np.inf, np.inf)[0])
    

    Here logp prints -1264.66566393, which is not as good as the mpmath result (so some of the function's tails were lost to underflow), but it's reasonable. What I did was:

    • calculate the logarithm of the maximal value of the integrand (this is the variable offset; for two standard normal pdfs the integrand peaks at t = z/2)
    • subtract this offset from the logarithm of the integrand; this is the part norm.logpdf(t, 0, 1) + norm.logpdf(z-t, 0, 1) - offset
    • exponentiate the result, since we can't just move the logarithm inside the integral. Algebraically, this is the same as the product of the pdfs times exp(-offset). But numerically, it is far less likely to underflow; indeed, at t = z/2 it equals exp(0) = 1.
    • integrate normally, take the logarithm of the result, and add the offset back. Algebraically, the result is just the logarithm of the integral we wanted to take.
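The recipe above can be packaged into a reusable function for the original normal-plus-lognormal problem. Locating the integrand's peak numerically with scipy.optimize.minimize_scalar is my own addition (the answer computes the offset in closed form for two standard normals), so treat this as a sketch:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.stats import lognorm, norm

def log_prob_z(z):
    """Log of the convolution of N(0, 1) and LogN(mu=log(10), sigma=0.1)."""
    # Log-integrand of the convolution.
    log_f = lambda t: norm.logpdf(t, 0, 1) + lognorm.logpdf(z - t, 0.1, scale=10)
    # Find (approximately) where the log-integrand peaks; its value there
    # plays the role of the offset in the recipe above.
    peak = minimize_scalar(lambda t: -log_f(t))
    offset = log_f(peak.x)
    # Shift so the integrand's maximum is exp(0) = 1, integrate, shift back.
    val, _ = quad(lambda t: np.exp(log_f(t) - offset), -np.inf, np.inf)
    return offset + np.log(val)
```

For moderate z this agrees with the naive log-of-integral; for large z it keeps returning a finite, hugely negative number instead of -inf, which is exactly what a fitting algorithm needs.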