Search code examples
statisticsnumericalgamma-function

Fast incomplete gamma function


What is a fast way to calculate the incomplete gamma function, or at least a "good" approximation of it, in C++?

Background

What I ultimately need to calculate

Given a number of Bernoulli trails N, with probability p of success, I'm ultimately trying to calculate the probability of obtaining at most k successes, as a function of k. The cumulative binomial distribution F(k,N,p) gives this probability.

The need for speed

I need to calculate a few hundred thousand of these cumulative probabilities per second. Calculating the cumulative binomial distribution by straightforward summation is very computation-intensive for large N. Using the incomplete beta function is a lot better, but still quite computation intensive.

Exploitable constraints

I'm hoping the following constraints from the application domain can help with speeding up the calculation:

  • p < 0.01 (the distribution is always very skew)
  • N > 50

Poisson approximation

After some experimentation in Excel, I've learned that the Poisson approximation is excellent under the above conditions. I.e. B(N,p) at k is almost identical to Pois(Np) at k under the conditions of interest. This means I only need a function of 2 variables, no longer 3.

I understand that the cumulative Poisson distribution can be calculated in terms of the incomplete gamma function, which, judging by the source code in the cephes library, seems to be quite a lot simpler to calculate than the original incomplete beta function one would have had to calculate without the Poisson approximation. But it still isn't very simple and is an iterative numerical calculation. So now I'm looking for a fast way to calculate the incomplete gamma function. I'm wondering whether there isn't a closed-form expression that can approximate it reasonably well.

Required precision

20% relative error is quite acceptable on the integral/probability (considered from every k, in both directions).

I've considered using an interpolated look up table for the Poisson CDF directly, but evenly-spaced domain-points are probably less-than-ideal and the domain would also then have to be restricted to an arbitrary rectangle. An analytic function with quite a number of tweaked parameters is what I'm hoping to find ideally.


Solution

  • Instead of using the Gamma function, I concocted this approximation that transforms the Poisson variables into a standard normal variable:

    float poisson_z(float x, float mu){
        static const float twoThirds = 2.0f/3.0f;
        float w = sqrt((x+0.5f)/mu) - 1.0f;
        float coeff = w>=0.0f ? 0.085f : 0.15f;
        return (x-mu+twoThirds)/sqrtf(mu*(1.0f+w*(0.68f+w*coeff)));
    }
    

    There is no shortage of approximations for the standard normal distribution.