Cannot use predFit to get confidence interval data


I'm trying to calculate confidence intervals for my nls model. I used the same code as the accepted answer to "How to calculate confidence intervals for Nonlinear Least Squares in r?"

But I get a strange error:

Error in eval(form[[3]]) : object 'a' not found
4. eval(form[[3]])
3. eval(form[[3]])
2. predFit.nls(gloss.nls, newdata = data.frame(stimulus = seq(0, 1, by = 0.1)), interval = "confidence", level = 0.9)
1. predFit(gloss.nls, newdata = data.frame(stimulus = seq(0, 1, by = 0.1)), interval = "confidence", level = 0.9)

I use nearly the same code as in that answer; only the data differ:

gloss.nls <- nls(
                normP ~ a[1]*stimulus^3+a[2]*stimulus,
                data = data.mlds %>% filter(overall == TRUE),
                start = list(a=c(0.4,0.6))
                )

predFit(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)

Here's my data:

id  rank  stimulus  pscale  normP  overall
0   1   0.000   0.0000000   0.00000000  TRUE
0   2   0.125   0.3151757   0.05889716  TRUE
0   3   0.250   0.9225827   0.17240385  TRUE
0   4   0.375   1.4164383   0.26469110  TRUE
0   5   0.500   1.7400011   0.32515557  TRUE
0   6   0.625   2.3531344   0.43973235  TRUE
0   7   0.750   3.1662257   0.59167546  TRUE
0   8   0.875   4.3538122   0.81360082  TRUE
0   9   1.000   5.3512879   1.00000000  TRUE
1   1   0.000   0.0000000   0.00000000  FALSE

Solution

  • Short answer

    Try

    a <- c(0.4,0.6)
    predFit(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)
    

    Long answer

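    First, note that your model is linear in its parameters, so you can estimate it with ordinary least squares (OLS) via lm, where confidence intervals are straightforward. Note that the lm call below also fits an intercept, which the nls model does not have, so the fitted value at stimulus = 0 is not exactly 0.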

    library(tidyverse)
    gloss.lm  <-  lm(normP ~ I(stimulus^3)+stimulus,
                    data = data.mlds %>% filter(overall == TRUE)  )
    predict(gloss.lm, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)
               fit         lwr        upr
    1  0.005554547 -0.02791979 0.03902889
    2  0.061136954  0.03572392 0.08654999
    3  0.119410056  0.09931945 0.13950067
    4  0.183064551  0.16435972 0.20176938
    5  0.254791132  0.23459593 0.27498634
    6  0.337280497  0.31518226 0.35937873
    7  0.433223342  0.41047149 0.45597519
    8  0.545310361  0.52354420 0.56707652
    9  0.676232250  0.65548488 0.69697962
    10 0.828679707  0.80399326 0.85336616
    11 1.005343426  0.96738940 1.04329745
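
To match the nls specification exactly, the intercept can be dropped with 0 + in the formula. A minimal sketch, with the overall == TRUE rows typed in directly as d (a name introduced here) so that it runs without tidyverse:

```r
# Rows of data.mlds with overall == TRUE (values copied from the question)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

# "0 +" removes the intercept, so this is the same mean function as
# a[1] * stimulus^3 + a[2] * stimulus in the nls call
gloss.lm0 <- lm(normP ~ 0 + I(stimulus^3) + stimulus, data = d)

# 90% confidence intervals for the fitted mean on a grid of stimulus values
ci0 <- predict(gloss.lm0,
               newdata = data.frame(stimulus = seq(0, 1, by = 0.1)),
               interval = "confidence", level = 0.9)
ci0
```

With no intercept, the fit and both bounds are 0 at stimulus = 0, since the model predicts 0 there for any parameter values.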
    

    If you insist on estimating the model using nonlinear least squares, then

    gloss.nls <-  nls(normP ~ a[1]*stimulus^3+a[2]*stimulus,
                    data = data.mlds %>% filter(overall == TRUE) ,
                    start=list(a=c(.5, .5)) )
    

    Annoyingly, predict.nls does not compute confidence intervals: it accepts the interval and level arguments but ignores them, so the call below returns only the fitted values.

    predict(gloss.nls, newdata = data.frame(stimulus=seq(0, 1, by = 0.1)), interval = "confidence", level= 0.9)
     [1] 0.00000000 0.05704647 0.11672200 0.18165566 0.25447650 0.33781360
     [7] 0.43429601 0.54655279 0.67721301 0.82890574 1.00426003
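
For reference, the interval that predict.nls leaves out can be assembled by hand with the delta method. The following is a minimal base R sketch under a few assumptions: the overall == TRUE rows are typed in directly as d, and the names G, se_fit and t_crit are introduced here for illustration. For this particular model the gradient with respect to the two parameters is simply (stimulus^3, stimulus), so no numerical differentiation is needed.

```r
# Rows of data.mlds with overall == TRUE (values copied from the question)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

fit <- nls(normP ~ a[1] * stimulus^3 + a[2] * stimulus,
           data = d, start = list(a = c(0.5, 0.5)))

new_x <- seq(0, 1, by = 0.1)
pred  <- predict(fit, newdata = data.frame(stimulus = new_x))

# Gradient of the mean function with respect to (a[1], a[2]) at each new point
G <- cbind(new_x^3, new_x)

# Delta method: the variance of each fitted value is the diagonal of
# G %*% vcov(fit) %*% t(G); rowSums() extracts that diagonal directly
se_fit <- sqrt(rowSums((G %*% vcov(fit)) * G))

# Two-sided t quantile for a 90% interval on the residual degrees of freedom
level  <- 0.9
t_crit <- qt(1 - (1 - level) / 2, df = df.residual(fit))

ci <- cbind(fit = pred,
            lwr = pred - t_crit * se_fit,
            upr = pred + t_crit * se_fit)
ci
```

The interval width grows from zero at stimulus = 0, where the model predicts 0 regardless of the parameters, and varies from row to row.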
    

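    Luckily, investr::predFit can compute confidence intervals for nls fits.

    library(investr)
    # Note: as written, this requests prediction intervals, and confidence=.9 only adds a
    # column to newdata; the interval level stays at predFit's default of 0.95.
    predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))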
    

    ... but this returns an error (which you encountered in your question).

    I did not dig too deep into the predFit.nls code, but it seems that predFit silently re-evaluates the right-hand side of the model formula in the background (the traceback shows eval(form[[3]])), and if it does not find everything it needs, it returns a cryptic error. Creating an object named a in the workspace, with the same shape as the parameter vector, is enough to resolve the error.

    a <- coef(gloss.nls)
    investr::predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))
                 fit          lwr        upr
     [1,] 0.00000000 -0.050130071 0.05013007
     [2,] 0.05704647  0.006916398 0.10717654
     [3,] 0.11672200  0.066591929 0.16685207
     [4,] 0.18165566  0.131525585 0.23178573
     [5,] 0.25447650  0.204346430 0.30460657
     [6,] 0.33781360  0.287683526 0.38794367
     [7,] 0.43429601  0.384165935 0.48442608
     [8,] 0.54655279  0.496422720 0.59668286
     [9,] 0.67721301  0.627082944 0.72734309
    [10,] 0.82890574  0.778775670 0.87903581
    [11,] 1.00426003  0.954129960 1.05439010
    

    Interestingly, the values in a make no difference. Try, e.g., a <- c(7500, -100) and you will get the same results. This might be a bug in investr.

    a <- c(7500,-100)
    predFit(gloss.nls, interval='prediction', newdata = data.frame(stimulus=seq(0, 1, by = 0.1), confidence=.9))
                 fit          lwr        upr
     [1,] 0.00000000 -0.050130071 0.05013007
     [2,] 0.05704647  0.006916398 0.10717654
     [3,] 0.11672200  0.066591929 0.16685207
     [4,] 0.18165566  0.131525585 0.23178573
     [5,] 0.25447650  0.204346430 0.30460657
     [6,] 0.33781360  0.287683526 0.38794367
     [7,] 0.43429601  0.384165935 0.48442608
     [8,] 0.54655279  0.496422720 0.59668286
     [9,] 0.67721301  0.627082944 0.72734309
    [10,] 0.82890574  0.778775670 0.87903581
    [11,] 1.00426003  0.954129960 1.05439010
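
One thing worth checking in the two outputs above: the distance from fit to lwr is 0.05013007 in every row, including stimulus = 0, where the model predicts exactly 0 whatever the parameters are. That is what a prediction interval looks like if the standard error of the fitted mean is taken as zero everywhere, so the dummy a may silence the error without giving usable intervals. A possible way around it, assuming (not verified against the investr source) that predFit copes better with scalar parameter names than with the indexed vector a[1], a[2], is to refit with the hypothetical names a1 and a2 and to request the confidence interval through interval and level:

```r
# Rows of data.mlds with overall == TRUE (values copied from the question)
d <- data.frame(
  stimulus = seq(0, 1, by = 0.125),
  normP = c(0, 0.05889716, 0.17240385, 0.2646911, 0.32515557,
            0.43973235, 0.59167546, 0.81360082, 1)
)

# Same model as gloss.nls, but with scalar parameters a1 and a2
# (names chosen here) instead of the indexed vector a[1], a[2]
gloss.nls2 <- nls(normP ~ a1 * stimulus^3 + a2 * stimulus,
                  data = d, start = list(a1 = 0.4, a2 = 0.6))

new_d <- data.frame(stimulus = seq(0, 1, by = 0.1))

# investr is optional here; the call is skipped if it is not installed
if (requireNamespace("investr", quietly = TRUE)) {
  ci <- investr::predFit(gloss.nls2, newdata = new_d,
                         interval = "confidence", level = 0.9)
  print(ci)
}
```

If the intervals from this call vary in width across rows and shrink to zero at stimulus = 0, that would support the reading above.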
    

    Data:

    data.mlds <- structure(list(id = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L),
        rank = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L), stimulus = c(0,
        0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1, 0), pscale = c(0,
        0.3151757, 0.9225827, 1.4164383, 1.7400011, 2.3531344, 3.1662257,
        4.3538122, 5.3512879, 0), normP = c(0, 0.05889716, 0.17240385,
        0.2646911, 0.32515557, 0.43973235, 0.59167546, 0.81360082,
        1, 0), overall = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
        TRUE, TRUE, FALSE)), row.names = c(NA, -10L), class = "data.frame")