Search code examples
rnlsnon-linear-regressionconvergence

Error in nonlinear least squeares in R - Logistic and Gompertz curves


I'm working on a model for variable y, in which I intend to use time as an explanatory variable. I've chosen a Gompertz and a logistic curve as candidates, but when I try to estimate the coefficients (using both nls and nls2), I end up getting different errors (singularity or step factor reduced below 'minFactor'). I would really appreciate any help. Here is my code and a deput version of the info object.

I chose the initial values according to the criteria in http://www.metla.fi/silvafennica/full/sf33/sf334327.pdf

library(nls2)

> dput(info)
structure(list(y = c(0.308, 0.279, 0.156, 0.214, 0.224, 0.222, 
0.19, 0.139, 0.111, 0.17, 0.155, 0.198, 0.811, 0.688, 0.543, 
0.536, 0.587, 0.765, 0.667, 0.811, 0.587, 0.617, 0.586, 0.633, 
2.231, 2.202, 1.396, 1.442, 1.704, 2.59, 2.304, 3.026, 2.7, 3.275, 
3.349, 3.936, 9.212, 8.773, 6.431, 6.983, 7.169, 9.756, 10.951, 
13.938, 14.378, 18.406, 24.079, 28.462, 51.461, 46.555, 39.116, 
43.982, 41.722), t = 1:53), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -53L))

summary(gomp_nls <- nls2(y ~ alpha*exp(-beta*exp(-gamma*t)), 
                 data = info, 
                 start = list(alpha = 40, beta = 4.9, gamma = 0.02), 
                 algorithm = "default")
        )

summary(logist_nls <- nls2(y ~ alpha/(1+beta*exp(-gamma*t)), 
                          data = info, 
                          start = list(alpha = 40, beta = 128, gamma = 0.02), 
                          algorithm = "default"))
        )

I'd appreciate any help


Solution

  • The "default" algorithm for nls2 is to use nls. You want to specify "brute-force" or one of the other algorithms for finding an initial value. The starting value should be a data frame of two rows such that it will fill in the hypercube so defined with potential starting values.

    It will then evaluate the residual sum of squares at each of those starting values and return the starting values at which the formula gives the least sum of squares.

    If you find that the result returned by nls2 is at the boundary of the region you defined then enlarge the region and try again. (You might not need this step if the starting value returned are good enough anyways.)

    Finally run nls with the starting values you found.

    library(nls2)
    
    ## 1
    
    fo1 <- y ~ alpha*exp(-beta*exp(-gamma*t))
    st1 <- data.frame(alpha = c(10, 100), beta = c(1, 100), gamma = c(0.01, 0.20))
    fm1.0 <- nls2(fo1, data = info, start = st1, algorithm = "brute-force")
    
    fm1 <- nls(fo1, data = info, start = coef(fm1.0))
    
    ## 2
    
    fo2 <- y ~ alpha/(1+beta*exp(-gamma*t))
    st2 <- data.frame(alpha = c(10, 1000), beta = c(1, 10000), gamma = c(0.01, 0.20))
    fm2.0 <- nls2(fo2, data = info, start = st2, algorithm = "brute-force")
    
    fm2 <- nls(fo2, data = info, start = coef(fm2.0))
    
    # plot both fits
    
    plot(y ~ t, info)
    lines(fitted(fm1) ~ t, info, col = "blue")
    lines(fitted(fm2) ~ t, info, col = "red")
    

    screenshot

    Note

    Note that for the data shown these two 2-parameter exponential models fit reasonably well so if you are only interested in the range where it rises exponentially then these could be alternatives to consider. (The first one below is better because the coefficients are more similar to each other. The second one may have scaling problems.)

    fm3 <- nls(y ~ a * exp(b/t), info, start = c(a = 1, b = 1))
    fm4 <- nls(y ~ a * t^b, info, start = c(a = .001, b = 6))