Search code examples
rnumerical-integrationspline

integrate quadratic b-splines in R


I am working with a function that depends on quadratic B-spline interpolation estimated up front by the the cobs function in the same R package. The estimated knots and corresponding coefficients are given in code. Further on, I require the integral of this function from 0 to some value, for example 0.6 or 0.7. Since my function is strictly positive, the integral value should increase if the upper bound of the integral increases. However this is not the case for some values, as shown when using 0.6 and 0.7

library(cobs)
b <- 0.6724027
xi1 <- 0.002541667
xi2 <- 2.509625
knots <- c(5.000010e-06, 8.700000e-05, 3.420000e-04, 1.344000e-03, 5.292000e-03, 2.082900e-02, 8.198800e-02, 3.227180e-01, 1.270272e+00, 5.000005e+00)
coef <- c(2.509493, 2.508141, 2.466733, 2.378368, 2.239769, 2.063977, 1.874705, 1.601780, 1.288163, 1.262683, 1.432729)
fn <- function(x) {
  z <- (2 - b) * (cobs:::.splValue(2, knots, coef, x, 0) - 2 * x * xi1) / xi2 - b
  return (z)
}

x <- seq(0, 0.7, 0.0001)
plot(x, fn(x), type = 'l')
integrate(f = fn, 0, 0.6)
# 0.1049019 with absolute error < 1.2e-15
integrate(f = fn, 0, 0.7)
# 0.09714124 with absolute error < 1.1e-15

I know I could integrate directly on the cobs:::.splValue function, and transform the results correspondingly. However, I am interested to know why this strange behaviour occurs.


Solution

  • I think that the algorithm used by the function "integrate" is not behaving well for those conditions. For example, if you modify the lower limits, it works as expected:

    > integrate(f = fn, 0.1, 0.6)
    0.06794357 with absolute error < 7.5e-16
    
    > integrate(f = fn, 0.1, 0.7)
    0.07432096 with absolute error < 8.3e-16
    

    This is common with numerical integration methods, you have to choose on a case by case basis. I'm using the trapezoidal rule to integrate over the same region and works well original code

    composite.trapezoid <- function(f, a, b, n) {
      if (is.function(f) == FALSE) {
        stop('f must be a function with one parameter (variable)')
      }
    
      h <- (b - a) / n
    
      j <- 1(:n - 1)
    
      xj <- a + j * h
    
      approx <- (h / 2) * (f(a) + 2 * sum(f(xj)) + f(b))
    
      return(approx)
    }
    
    > composite.trapezoid(f = fn, 0, 0.6, 10000)
    [1] 0.1079356
    > composite.trapezoid(f = fn, 0, 0.7, 10000)
    [1] 0.1143195
    

    If we analyze the behavior of the integral close to the 0.65 region, we can see that there is a problem with the first approach (it is not smooth):

    tst = sapply(seq(0.5, 0.8, length.out = 100), function(upper) {
      integrate(f = fn, 0, upper)[[1]]
    
    })
    plot(seq(0.5, 0.8, length.out = 100), tst)
    

    and that the trapezoid rule behaves better:

    tst2 = sapply(seq(0.5, 0.8, length.out = 100), function(upper) {
      composite.trapezoid(f = fn, 0, upper, 10000)[[1]]
    
    })
    plot(seq(0.5, 0.8, length.out = 100), tst2)