Search code examples
rfunctionintegrate

Integrate function giving very odd result


I am trying to find the area under a curve defined by the overlap of two normal distributions, each with their own mean and std. dev. This is what the code looks like that I have set up so far.

 min.f1f2 <- function(x, mu1, mu2, sd1, sd2)
 {f1 <- dnorm(x, mean=mu1, sd=sd1)
 f2 <- dnorm(x, mean=mu2, sd=sd2)
 pmin(f1, f2)}

When I try and integrate that function from -Inf to +Inf, with mu1=30, mu2=30, sd1=1, and sd2=2, it looks like this:

integrate(min.f1f2, -Inf, Inf, mu1=30, mu2=30, sd1=1, sd2=2)

This gives a result of 0.6773251 with absolute error < 7.6e-05. This seems all correct and good. However, when I translate the distribution means to 31:

integrate(min.f1f2, -Inf, Inf, mu1=31, mu2=31, sd1=1, sd2=2)

I get a result of: 8.972702e-06 with absolute error < 1.6e-05.

Given this is a direct translation, my thought is that this integration should find the same area under the curve of the overlap between the two distributions. However, R does not seem to agree. Is this something I have have caused, or is this some quirk of the integrate function that I have run afoul of?

Thanks for any light that you can shed on this.


Solution

  • The numerical integration function picks a finite number of points at which to evaluate the function. When you go from -Inf to Inf, the distance between successive blocks may be quite large and coarse. When you shift the distribution, it will be broken into chunks differently and produce different answers. Instead, if you make the lower limit min(mu1-5*sd1, mu2-5*sd2) and the upper limit be max(mu1+5*sd1, mu2+5*sd2) you will get more comparable answers.

    integrate(min.f1f2, 21,41,mu1=31, mu2=31, sd1=1, sd2=2)
    0.6773252 with absolute error < 4.8e-05
    integrate(min.f1f2, 20,40,mu1=30, mu2=30, sd1=1, sd2=2)
    0.6773252 with absolute error < 4.8e-05