Search code examples
matlabintegral

integral2 returns 0 when it shouldn't


I am having the following issue. I am trying to integrate a pdf function of bivariate normal over the entire support. However, Matlab returns 0 for d, which shouldn't be the case. What is wrong with my approach?

The code is below:

mu1=100;
mu2=500;
sigma1=1;
sigma2=2;
rho=0.5;
var1=sigma1^2;
var2=sigma2^2;

pdf = @(x,y) (1/(2*pi*sigma1*sigma2*sqrt(1-rho^2))).*exp((-1/(2*(1-rho^2)))*(((x-mu1).^2/var1)+((y-mu2).^2/var2)-((2*rho*(x-mu1).*(y-mu2))/(sigma1*sigma2))));

d = integral2(pdf,-Inf,Inf,-Inf,Inf)

Solution

  • As @Andras Deak commented, the "exponentials cut off very fast away from the peak". As a matter of fact, you can visualize it:

    mu1=100;
    mu2=500;
    sigma1=1;
    sigma2=2;
    rho=0.5;
    var1=sigma1^2;
    var2=sigma2^2;
    
    pdf = @(x,y) (1/(2*pi*sigma1*sigma2*sqrt(1-rho^2))).*exp((-1/(2*(1-rho^2)))*(((x-mu1).^2/var1)+((y-mu2).^2/var2)-((2*rho*(x-mu1).*(y-mu2))/(sigma1*sigma2))));
    figure
    fsurf(pdf,[90 110 490 510])
    figure
    fsurf(pdf,[0 200 400 600])
    

    In the first figure, the limits are close to the means you provided. You can see the shape of the bivariate normal: enter image description here

    If you extend the limits, you will see what it looks like a discontinuity: enter image description here

    The built-in integral functions try to evaluate the integrals, but if your limits are -inf and inf, your function is zero almost everywhere, with a discontinuity close to the means.

    To treat singularities, you should break your domain, as suggested by MATLAB. Since the function is zero almost everywhere, you can integrate only around the means:

    d = integral2(pdf,90,110,490,510)
    
    > d =
    > 
    >     1.0000
    

    You can also write it as a function of your variables. The empirical rule states that 99.7% of your data is within 3 standard deviations from the means, so:

    d = integral2(pdf,mu1-3*sigma1,mu1+3*sigma1,mu2-3*sigma2,mu2+3*sigma2)
    
    > d =
    > 
    >     0.9948
    

    which will get you a pretty good result. We can elaborate more. In the wikipedia page of the empirical rule, the expression

    erf(x/sqrt(2))
    

    will give the "Expected fraction of population inside range mu+-x*sigma". For the short precision shown as standard in MATLAB, if you choose, say x=5, you will get:

    x = 5;
    erf(x/sqrt(2))
    
    > ans =
    > 
    >     1.0000
    

    Pretty much every data is contained within 5 standard deviations. So, you may neglect the domain outside this range in the double integration to avoid the (almost) singularity.

    d = integral2(pdf,mu1-x*sigma1,mu1+x*sigma1,mu2-x*sigma2,mu2+x*sigma2)
    
    > d =
    > 
    >     1.0000