Search code examples
javapythonnumpyprobabilitynormal-distribution

Compute probability over a multivariate normal


My question addresses both mathematical and CS issues, but since I need a performant implementation I am posting it here.

Problem:

I have an estimated normal bivariate distribution, defined as a python matrix, but then I will need to transpose the same computation in Java. (dummy values here)

mean = numpy.matrix([[0],[0]])
cov = numpy.matrix([[1,0],[0,1]])

When I receive in inupt a column vector of integers values (x,y) I want to compute the probability of that given tuple.

value = numpy.matrix([[4],[3]])
probability_of_value_given_the_distribution = ???

Now, from a matematical point of view, this would be the integral for 3.5 < x < 4.5 and 2.5 < y < 3.5 over the probability density function of my normal.

What I want to know:

Is there a way to avoid the effective implementation of this, that implies dealing with expressions defined over matrices and with double integrals? Besides that it will take me a while if I had to implement it by myself, this would be computationally expensive. An approximate solution would be perfectly fine for me.

My reasonings:

In an univariate normal, one could simply use the cumulative distribution function (or even store its values for the standard one and then normalize), but unfortunately there appears not to be a closed cdf form for multivariates.

Another approach for univariate is to use the inverse of bivariate approximation (so, approximate a normal as a binomial), but extending this to the multivariate I can't figure out how to keep in count the covariances.

I really hope someone has already implemented this, I need it soon (finishing my thesis) and I couldn't find anything.


Solution

  • OpenTURNS provides an efficient implementation of the CDF of a multinormal distribution (see the code).

    import numpy as np
    
    mean = np.array([0.0, 0.0])
    cov = np.array([[1.0, 0.0],[0.0, 1.0]])
    

    Let us create the multinormal distribution with these parameters.

    import openturns as ot
    multinormal = ot.Normal(mean, ot.CovarianceMatrix(cov))
    

    Now let us compute the probability of the square [3.5, 4.5] x |2.5, 3.5]:

    prob = multinormal.computeProbability(ot.Interval([3.5,2.5], [4.5,3.5])) 
    print(prob)
    

    The computed probability is

    1.3701244220201715e-06