Tags: r, parameters, curve-fitting, modeling, nls

Starting values for 4 parameter NLS - Chapman Richards function


Note: I have read several of the posts on how to find starting values for NLS, but I have not found one with an equation of this form (i.e. 4 parameters, with an exponential term raised to a power).

I am struggling tremendously to find suitable starting values for the Chapman Richards equation, which is commonly used in forestry to model tree growth.

y(t) = α * (1 - β * exp(-k * t)^{1/(1-m)})

I typically try to find initial values by plotting a line with set parameters, and then tweaking it to fit the data more closely (Image 1). After this I would use the parameters in the function:

initial.test <- chapmanRichards(0:15, 42, 0.95, 0.28, 0.67)
plot(age, topHeight, type = "p", xlab = "year since planting", ylab = "Dom height (m)", xlim = c(0, 20), ylim = c(0, 50))
lines(0:15, initial.test, col = "red")

[Image 1]

nls(topHeight ~ chapmanRichards(age,a,b,k,m),start=list(a=42,b=0.95,k=0.28,m=0.67))

In this case, the program is able to fit the curve with the starting values provided. The problem arises when the data is a bit noisy: after 2 hours of fiddling with the initial test values, I still can't find good enough starting values (Image 2 shows a few attempts on another dataset).

[Image 2]

Can anyone advise on a good way to find suitable starting values? I have thought of creating a matrix that runs a sequence for each of the parameters and looping nls over those starting values, but I am not sure how the code would look. Any other advice would be greatly appreciated!

PS - would this be something better suited to Excel's Solver?


Solution

  • As @Roland pointed out in the comments, the parameters in the equation shown in the question are not identifiable, so assume the equation is as he showed:

    y = a * (1 - b * exp(-k * t))^{1/(1-m)}
    

    take the log of both sides:

    log(y) ~ log(a) + (1/(1-m)) * log(1 - b * exp(-k*t))
    

    and let log(a) = A, 1/(1-m) = M and b = exp(k*B) giving:

    log(y) ~ A + M * log(1 - exp(k*(B-t)))
    

    Since B is an offset and k is a scaling, we can estimate them as B = mean(t) and k = 1/sd(t). Because log(1 - exp(k*(B - t))) is only defined for t > B, B may need to be moved below min(t) if the fit reports NaNs. Using algorithm = "plinear", we can avoid starting values for the linear parameters (A and M), provided we specify the right-hand side as a matrix such that A times the first column plus M times the second column gives the predicted value. Thus we have:

    st <- list(B = mean(t), k = 1/sd(t))
    fm0 <- nls(log(y) ~ cbind(1, log(1 - exp(k*(B - t)))), start = st,
      algorithm = "plinear")
    

    and then back-transform the coefficients so obtained (a = exp(A), m = 1 - 1/M, b = exp(k*B)) to get the starting values for running the final nls.

    Also note that nls2 in the nls2 package can evaluate the model on a grid or at a random set of points to get starting values.
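The log-linearisation and the grid idea above can be combined in base R. What follows is only a rough sketch under stated assumptions, not the method from the answer. It swaps algorithm = "plinear" for a plain lm() at each grid point and does not use nls2. The data are simulated, the "true" parameter values and grid ranges are made up for illustration, and fitOne is a hypothetical helper name. The grid is placed directly on b (kept at or below 1) and k, so that 1 - b * exp(-k * t) stays positive for every positive age.

```r
# Chapman-Richards curve in the corrected form: a * (1 - b * exp(-k * t))^(1/(1 - m))
chapmanRichards <- function(t, a, b, k, m) {
  a * (1 - b * exp(-k * t))^(1 / (1 - m))
}

# Simulated data; the parameter values here are assumptions for illustration only.
set.seed(42)
age <- 1:15
topHeight <- chapmanRichards(age, a = 42, b = 0.95, k = 0.28, m = 0.67) *
  exp(rnorm(length(age), sd = 0.03))

# Coarse grid over the two nonlinear parameters (assumed ranges).
grid <- expand.grid(b = seq(0.5, 1, by = 0.05), k = seq(0.05, 0.6, by = 0.05))

# For fixed (b, k) the log model is linear in A = log(a) and M = 1/(1 - m),
# so lm() returns both, and its residual sum of squares ranks the grid point.
fitOne <- function(b, k) {
  x <- log(1 - b * exp(-k * age))
  fit <- lm(log(topHeight) ~ x)
  c(A = unname(coef(fit)[1]), M = unname(coef(fit)[2]), rss = deviance(fit))
}
res <- cbind(grid, t(mapply(fitOne, grid$b, grid$k)))
best <- res[which.min(res$rss), ]

# Back-transform to the original parameters: a = exp(A), m = 1 - 1/M.
st <- list(a = exp(best$A), b = best$b, k = best$k, m = 1 - 1 / best$M)

# Final fit on the original scale; returns NULL if nls still fails to converge.
fm <- tryCatch(
  nls(topHeight ~ chapmanRichards(age, a, b, k, m), start = st),
  error = function(e) NULL
)
```

If fm is not NULL, coef(fm) gives the final estimates. Otherwise st can still be refined with a finer grid around best. With the B parameterisation used for fm0, b would be recovered as exp(k * B) before the final fit.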