Tags: r, confidence-interval, numerical-integration, nls, non-linear-regression

How to integrate (AUC) nls model and Monte-Carlo confidence interval in R


I'm trying to integrate (find the area under) a non-linear function fitted with nls() from x = 0 to infinity in R. However, R's integrate function requires a function (f) as its first argument.

In short, I'd like to do something approximating:

integrate(my.nls, lower = 0L, upper = Inf)

But unfortunately, my.nls is actually a fitted model object, not a function. I considered using a smoothing spline to interpolate and then integrating the resulting function. But I'd much prefer to use the true nls function instead of an approximation. Besides, given the infinite nature of the integration I have to be very careful with extrapolation in the positive direction.

Ideally, the technique would also let me integrate other quantities derived from the nls fit, for example the area under the simulated 97.5% confidence bound as calculated by the propagate package's predictNLS function.

I'm rather new to R and this is only my second post on SO, I think, so please forgive me if this is a trivial or foolish question or I've committed some other sin. Thus far, my misguided attempts with as.function or function(){predict(my.nls()} have not gotten me anywhere, and I would greatly appreciate any assistance.

Below is a short example that should serve to illustrate my problem:

### Make up some data
x <- seq(from = 10, to = 1, length.out = 15)+(rnorm(15)+2)
y <- seq(from = 1, to = 10, length.out = 15)+(rnorm(15)+2)

### Fit an nls model, in this case, just a plain linear one.
my.nls <- nls(y~m*x+b, start = c(m=-1, b=100))

### Get confidence intervals from propagate package, might take a couple 
#seconds to run. Only serves to illustrate the type of values, the 
#function of which, I'd like to integrate (see my.preds$summary)

library(propagate) 

my.preds <- predictNLS(my.nls, newdata = data.frame("x" = x))

### Integrate (totally not right, just (hopefully) illustrating 
#the idea of what I'd like to do)

#exact.fn.auc <- integrate(my.nls, lower = 0L, upper = Inf)
#upperCI.fn.auc <- integrate(predictNLS(my.nls)$summary$Sim.97.5%, lower = 0L, upper = Inf)

PS: I recognize that the syntax in the last two lines is very wrong; I'm just trying to show where the values represented by the function would come from if they had been calculated alone. If there's any question as to what I'm getting at, please ask and I'll try to rephrase my problem.

PPS: It's very likely I'm going about this from entirely the wrong direction, so suggestions for other approaches are welcome as well. The models I must fit are in truth non-linear (unlike the one illustrated above), and I would like to get the area below the mean function and its confidence intervals in some way. The problem I have with splines is that my true models become roughly asymptotic as they approach y = 0, and since I'm integrating to Inf, small aberrations in the extrapolation produce very different values for the area under the curve.


Solution

  • The main issue is indeed that integrate needs a function, and that is not what you provided. Another issue, at least in this example, is that the integral diverges when the upper limit is Inf, because the fitted straight line does not decay to zero.

    Restricting attention to [0, 10], for the first case we have

    integrate(function(p) 
      predict(my.nls, data.frame(x = p)),
      lower = 0, upper = 10)
    # 102.0578 with absolute error < 1.1e-12
    

    For the second case we have

    integrate(function(p) 
      predictNLS(my.nls, newdata = data.frame(x = p), do.sim = FALSE)$summary$`Prop.97.5%`,
      lower = 0, upper = 10)
    # 113.9549 with absolute error < 1.7e-06
    

    where I also added do.sim = FALSE so as not to use Monte Carlo, which takes quite a bit longer; you may of course adjust the parameters (e.g., the number of Monte Carlo iterations, nsim).
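
For a model that does decay to zero, the same predict() wrapper can be integrated all the way to Inf. The following is only a minimal sketch under stated assumptions: the exponential-decay data, the model y ~ a * exp(-k * x), the starting values, and the number of draws are all made up for illustration and are not from the question. It also shows one possible way to get a Monte Carlo interval for the area itself, by drawing parameter vectors from a normal approximation based on coef() and vcov() and integrating each draw. That is a swapped-in technique, not what predictNLS does, and it answers a slightly different question: the area under the pointwise 97.5% band is generally not the same thing as the 97.5% quantile of the area.

```r
library(MASS)  # for mvrnorm(); MASS ships with standard R installations

set.seed(1)

# Hypothetical decaying data (assumption): y = 5 * exp(-0.7 * x) plus small noise
x <- seq(0, 10, length.out = 30)
y <- 5 * exp(-0.7 * x) + rnorm(30, sd = 0.05)

fit <- nls(y ~ a * exp(-k * x), start = list(a = 4, k = 0.5))

# Wrap predict() so that integrate() receives a vectorized function of x
f <- function(p) as.numeric(predict(fit, newdata = data.frame(x = p)))
auc <- integrate(f, lower = 0, upper = Inf)

# For this particular model the area has the closed form a / k,
# which serves as a sanity check on the numerical result
a <- coef(fit)[["a"]]
k <- coef(fit)[["k"]]
c(numeric = auc$value, closed_form = a / k)

# Monte Carlo interval for the area (assumes the parameter estimates are
# approximately multivariate normal with covariance vcov(fit))
draws <- mvrnorm(2000, mu = coef(fit), Sigma = vcov(fit))
auc_draws <- apply(draws, 1, function(th) {
  integrate(function(p) th[["a"]] * exp(-th[["k"]] * p),
            lower = 0, upper = Inf)$value
})
ci <- quantile(auc_draws, c(0.025, 0.975))
ci
```

If a drawn decay rate k were zero or negative, the corresponding integral would diverge and integrate would stop with an error, so with noisier fits such draws would need to be handled explicitly.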