Search code examples
rnlsconvergence

nls - convergence error


For this dataset:

dat = structure(list(x = c(5L, 5L, 5L, 5L, 10L, 10L, 10L, 10L, 15L, 
15L, 15L, 15L, 17L, 17L, 17L, 17L, 20L, 20L, 20L, 20L, 20L, 20L, 
20L, 20L, 22L, 22L, 22L, 22L, 24L, 24L, 24L, 24L, 25L, 25L, 25L, 
25L, 27L, 27L, 27L, 27L, 30L, 30L, 30L, 30L, 35L, 35L, 35L, 35L), 
y = c(2.2, 2.2, 1.95, 1.9, 4.1, 3.95, 3.75, 3.4, 5.15, 4.6, 
4.75, 5.15, 3.7, 4.1, 3.9, 3.5, 7, 6.7, 6.7, 6.95, 4.95, 6, 6.45, 
6.4, 7, 4.45, 6.15, 6.4, 7, 6.6, 6.7, 7, 4.5, 4.7, 5.75, 4.35, 
5.4, 5.15, 5.7, 5.7, 0, 0, 0.5, 0, 0, 0, 0, 0)), .Names = c("x", "y"), 
row.names = c(6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 
15L, 16L, 17L, 34L, 35L, 36L, 37L, 18L, 19L, 20L, 21L, 38L, 39L, 
40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 22L, 23L, 24L, 
25L, 50L, 51L, 52L, 53L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L), 
class = "data.frame")

Where "x" is the temperature and "y" is the response variable of a biological process

I'm trying to fit this function

beta.reg<-function(x, Yopt,Tmin,Topt,Tmax, b1) {
Yopt*((x-Tmin)/(Topt-Tmin))^(b1*(Topt-Tmin)/(Tmax-Topt))*((Tmax-x) / (Tmax-Topt)) ^ b1
}

mod <- nls(y ~ beta.reg(x, Yopt,Tmin,Topt,Tmax, b1), data=dat,
       start=c(Yopt=6, Tmin=0.1, Topt=24, Tmax=30, b1=1),
       control=nls.control(maxiter=800))

But, I'm having this message error:

Error en numericDeriv(form[[3L]], names(ind), env) : Missing value or an infinity produced when evaluating the model

I've tried the same function with another similar dataset and fits correctly...

 rnorm<-(10)
 y <- c(20,60,70,49,10)
 rnorm<-(10)
 y <- c(20,60,70,49,10)
 dat<-data.frame(x = rep(c(15,20,25,30,35), times=5),
              rep = as.factor(rep(1:5, each=5)),
              y = c(y+rnorm(5), y+rnorm(5),y+rnorm(5),y+rnorm(5),y+rnorm(5)))

Could someone help me with this?

Session Info:

R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] nlme_3.1-118        latticeExtra_0.6-26 RColorBrewer_1.0-5  lattice_0.20-29    

loaded via a namespace (and not attached):
[1] grid_3.1.1  tools_3.1.1

Solution

  • There are so many problems here that I doubt it can be covered adequately in an SO post, but this should get you started.

    First, it looks like you want Tmax < max(dat$x), e.g., < 35. This causes a problem because then Tmax - x < 0 for some values of x and when you try to raise a negative number to a power (in the second term of your formula), you get NA's. This is the cause of the error message.

    Second, convergence of a non-linear model depends on the model formula and also the data, so the fact that the process converges with one set of data but not another is completely irrelevant.

    Third, non-linear modelling iteratively minimizes the residual sum of squares as a function of the parameters. If the RSS surface has local minima, and your start is close to one, the algorithms will find it. But only the global minimum is the true solution. Your problem has many, many local minima.

    Fourth, nls(...) uses the Gauss Newton method by default. Gauss Newton is notoriously unstable with shifting parameters (parameters which are added to or subtracted from the predictor, so Tmin and Tmax in your case). Fortunately, the minpak.lm package implements the Levenberg Marquardt method, which is much more stable under these conditions. The nlsLM(...) function in that package uses the same calling sequence as nls(...) and returns and object of type nls, so all the methods for that class of object work as well. Use that.

    Fifth, a fundamental assumption in non-linear regression (in fact all least-squares regression) is that the residuals are normally distributed. So you have to validate any solution using a Q-Q plot.

    Sixth, your model has a perverse set of characteristics. When Tmin -> -Inf the first term in the model approaches 1. It turns out that this yields a lower RSS than any other value of Tmin less than min(dat$x), so the algorithms all tend to drive Tmin to large negative values. You can see this easily as follows:

    library(minpack.lm)
    mod <- nlsLM(y ~ beta.reg(x, Yopt,Tmin,Topt,Tmax, b1), data=dat,
                 start=c(Yopt=6,Tmin=0,Topt=24,Tmax=50, b1=1),
                 control=nls.lm.control(maxiter=1024,maxfev=1024))
    coef(summary(mod))
    #         Estimate   Std. Error     t value     Pr(>|t|)
    # Yopt    6.347019    0.2919686 21.73870235 8.055342e-25
    # Tmin -155.530098 2204.0011003 -0.07056716 9.440694e-01
    # Topt   21.157545    0.6702713 31.56564484 2.240134e-31
    # Tmax   35.000000   11.4838614  3.04775537 3.933164e-03
    # b1      3.321326    9.1844548  0.36162468 7.194035e-01
    sum(residuals(mod)^2)
    # [1] 50.24696
    
    par(mfrow=c(1,2))
    plot(y~x,dat)
    with(as.list(coef(mod)),curve(beta.reg(x, Yopt,Tmin,Topt,Tmax, b1),add=TRUE))
    qqnorm(residuals(mod))
    

    This looks like a pretty decent fit but it isn't: the Q-Q plot shows that the residuals are not remotely normal. The fact that both Tmin and b1 are very poorly estimated, and the value for Tmin is not physically meaningful, are problems with the data, not the fit.

    Seventh, it turns out that the fit above is actually a local minimum. We can see this by doing a grid search on Tmin, Tmax, and b1 (leaving out Yopt and Topt to save time, and because these parameters are well estimated regardless of the starting point).

    init <- c(Yopt=6, Topt=24)
    grid <- expand.grid(Tmin= seq(0,4,len=100),
                        Tmax= seq(35,100,len=10),
                        b1  = seq(1,10,len=10))
    mod.lst <- apply(grid,1,function(gr){
      nlsLM(y ~ beta.reg(x, Yopt,Tmin,Topt,Tmax, b1), data=dat,
            start=c(init,gr),control=nls.control(maxiter=800)) })
    rss <- sapply(mod.lst,function(m)sum(residuals(m)^2))
    mod <- mod.lst[[which.min(rss)]]   # fit with lowest RSS
    coef(summary(mod))
    #        Estimate   Std. Error      t value     Pr(>|t|)
    # Yopt   6.389238    0.2534551 25.208557840 2.177168e-27
    # Topt  22.636505    0.5605621 40.381798589 7.918438e-36
    # Tmin  35.000002  104.6221159  0.334537316 7.396005e-01
    # Tmax  36.234602  133.4987344  0.271422809 7.873647e-01
    # b1   -41.512912 7552.0298633 -0.005496921 9.956395e-01
    sum(residuals(mod)^2)
    # [1] 34.24019
    
    plot(y~x,dat)
    with(as.list(coef(mod)),curve(beta.reg(x, Yopt,Tmin,Topt,Tmax, b1),add=TRUE))
    qqnorm(residuals(mod))
    

    Mathematically, this is a distinctly superior fit: RSS is lower and the residuals are much more nearly normally distributed. Again, the fact that the parameters are not well estimated, and are not physically meaningful, is a problem with the data (and perhaps the model formula), not the fitting process.

    All of the foregoing suggests that there is something wrong with your model. One problem with it, mathematically, is that the function is undefined for x outside of (Tmin,Tmax). Since you have data out to x=35, the fitting algorithm will never yield Tmax < 35 (if it converges). An approach to deal with this changes your model function slightly to clip to 0 outside that range. (I have no idea if this is legitimate based on the physics of your problem, though...).

    beta.reg<-function(x, Yopt,Tmin,Topt,Tmax, b1) {
      ifelse(x>Tmax,0,
        ifelse(x<Tmin,0,
          Yopt*((x-Tmin)/(Topt-Tmin))^(b1*(Topt-Tmin)/(Tmax-Topt))*((Tmax-x) / (Tmax-Topt)) ^ b1
      ))
    }
    

    Running the code above with this function yields:

    coef(summary(mod))
    #         Estimate   Std. Error     t value     Pr(>|t|)
    # Yopt   6.1470413   0.21976766   27.970636 3.202940e-29
    # Tmin -52.8172658 184.16899439   -0.286787 7.756528e-01
    # Topt  23.0777898   0.63750721   36.200045 7.638121e-34
    # Tmax  30.0039413   0.02529877 1185.984187 1.038918e-98
    # b1     0.5966129   0.32439982    1.839128 7.280793e-02
    
    sum(residuals(mod)^2)
    # [1] 28.10144
    
    par(mfrow=c(1,2))
    plot(y~x,dat)
    with(as.list(coef(mod)),curve(beta.reg(x, Yopt,Tmin,Topt,Tmax, b1),add=TRUE))
    qqnorm(residuals(mod))
    qqline(residuals(mod))
    

    In fact the grid search yields exactly the same result independent of starting point. Note that RSS is lower than any of the results with the earlier model, and that b1 is much better estimated (and very different from the estimate with the earlier model function). The residuals are still not normal, but in this case I would want to inspect the data for outliers.