Search code examples
rnumerical-integration

numerical integration of a tricky function


The prob package numerically evaluates characteristic functions for base R distributions. For almost all distributions there are existing formulas. For a few cases, though, no closed-form solution is known. Case in point: the Weibull distribution (but see below).

For the Weibull characteristic function I essentially compute two integrals and put them together:

fr <- function(x) cos(t * x) * dweibull(x, shape, scale)
fi <- function(x) sin(t * x) * dweibull(x, shape, scale)
Rp <- integrate(fr, lower = 0, upper = Inf)$value
Ip <- integrate(fi, lower = 0, upper = Inf)$value
Rp + (0+1i) * Ip

Yes, it's clumsy, but it works surprisingly well! ...ahem, most of the time. A user reported recently that the following breaks:

cfweibull(56, shape = 0.5, scale = 1)

Error in integrate(fr, lower = 0, upper = Inf) : 
  the integral is probably divergent

Now, we know that the integral isn't divergent, so it must be a numerical problem. With some fiddling I could get the following to work:

fr <- function(x) cos(56 * x) * dweibull(x, 0.5, 1)

integrate(fr, lower = 0.00001, upper = Inf, subdivisions=1e7)$value
[1] 0.08024055

That's OK, but it isn't quite right, plus it takes a fair bit of fiddling which doesn't scale well. I've been investigating this for a better solution. I found a recently published "closed-form" for the characteristic function with scale > 1 (see here), but it involves Wright's generalized confluent hypergeometric function which isn't implemented in R (yet). I looked into the archives for integrate alternatives, and there's a ton of stuff out there which doesn't seem very well organized.

As part of that searching it occurred to me to translate the region of integration to a finite interval via the inverse tangent, and voila! Check it out:

cfweibull3 <- function (t, shape, scale = 1){
  if (shape <= 0 || scale <= 0) 
    stop("shape and scale must be positive")
  fr <- function(x) cos(t * tan(x)) * dweibull(tan(x), shape, scale)/(cos(x))^2
  fi <- function(x) sin(t * tan(x)) * dweibull(tan(x), shape, scale)/(cos(x))^2
  Rp <- integrate(fr, lower = 0, upper = pi/2, stop.on.error = FALSE)$value
  Ip <- integrate(fi, lower = 0, upper = pi/2, stop.on.error = FALSE)$value
  Rp + (0+1i) * Ip
}

> cfweibull3(56, shape=0.5, scale = 1)
[1] 0.08297194+0.07528834i

Questions:

  1. Can you do better than this?
  2. Is there something about numerical integration routines that people who are expert about such things could shed some light on what's happening here? I have a sneaking suspicion that for large t the cosine fluctuates rapidly which causes problems...?
  3. Are there existing R routines/packages which are better suited for this type of problem, and could somebody point me to a well-placed position (on the mountain) to start the climb?

Comments:

  1. Yes, it is bad practice to use t as a function argument.
  2. I calculated the exact answer for shape > 1 using the published result with Maple, and the brute-force-integrate-by-the-definition-with-R kicked Maple's ass. That is, I get the same answer (up to numerical precision) in a small fraction of a second and an even smaller fraction of the price.

Edit:

I was going to write down the exact integrals I'm looking for but it seems this particular site doesn't support MathJAX so I'll give links instead. I'm looking to numerically evaluate the characteristic function of the Weibull distribution for reasonable inputs t (whatever that means). The value is a complex number but we can split it into its real and imaginary parts and that's what I was calling Rp and Ip above.

One final comment: Wikipedia has a formula listed (an infinite series) for the Weibull c.f. and that formula matches the one proved in the paper I referenced above, however, that series has only been proved to hold for shape > 1. The case 0 < shape < 1 is still an open problem; see the paper for details.


Solution

  • You may be interested to look at this paper, which discuss different integration methods for highly oscillating integrals -- that's what you are essentially trying to compute: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.6944

    Also, another possible advice, is that instead of infinite limit you may want to specify a smaller one, because if you specify the precision that you want, then based on the cdf of the weibull you can easily estimate how much of the tail you can truncate. And if you have a fixed limit, then you can specify exactly (or almost) the number of subdivisions (e.g. in order to have a few(4-8) points per period).