Search code examples
pythonscipynumericalsoftmaxunderflow

Evaluate log(1 - normal_cdf(x)) in a numerically stable way


How to evaluate log(1 - normal_cdf(x)) in a numerically stable way? Here, normal_cdf is the cumulative distribution function of the standard Normal distribution.

For example, in Python:

import scipy 
from scipy.stats import norm

np.log(1 - norm.cdf(10))

gives -inf with RuntimeWarning: divide by zero encountered in log since norm.cdf(10) is almost equal to 1. Is there a function like logsumexp that can avoid numerical under-flow?


Solution

  • Since the normal distribution is symmetric about 0, we have

    1 - F(x) = P(X > x)
             = P(X < -x)
             = F(-x)
    

    Hence

    np.log(1 - norm.cdf(10)) = np.log(norm.cdf(-10))
                             = norm.logcdf(-10)