I would like to fit an exponential decay function in R to the following data:
data <- structure(list(x = 0:38, y = c(0.991744340878828, 0.512512332368168,
0.41102449265681, 0.356621905557202, 0.320851602373477, 0.29499198506227,
0.275037747162642, 0.25938850981822, 0.245263623938863, 0.233655093612007,
0.224041426946405, 0.214152907133301, 0.207475138903635, 0.203270738895484,
0.194942528735632, 0.188107106969046, 0.180926819430008, 0.177028560207711,
0.172595416846822, 0.166729221891201, 0.163502461048814, 0.159286528409165,
0.156110097827889, 0.152655498715612, 0.148684858095915, 0.14733605355542,
0.144691873223729, 0.143118852619617, 0.139542186417186, 0.137730138713745,
0.134353615271572, 0.132197800438632, 0.128369567159113, 0.124971834736476,
0.120027536018095, 0.117678812415655, 0.115720611113327, 0.112491329844252,
0.109219168085624)), class = "data.frame", row.names = c(NA,
-39L), .Names = c("x", "y"))
I've tried fitting with nls but the generated curve is not close to the actual data.
It would be very helpful if anyone could explain how to work with such nonlinear data and find a function of best fit.
Try y ~ .lin / (b + x^c)
. Note that when using "plinear"
one omits the .lin
linear parameter when specifying the formula to nls
and also omits a starting value for it.
Also note that the .lin
and b
parameters are approximately 1 at the optimum so we could also try the one parameter model y ~ 1 / (1 + x^c)
. This is the form of a one-parameter log-logistic survival curve. The AIC for this one parameter model is worse than for the 3 parameter model (compare AIC(fm1)
and AIC(fm3)
) but the one parameter model might still be preferable due to its parsimony and the fact that the fit is visually indistinguishable from the 3 parameter model.
opar <- par(mfcol = 2:1, mar = c(3, 3, 3, 1), family = "mono")
# data = data.frame with x & y col names; fm = model fit; main = string shown above plot
Plot <- function(data, fm, main) {
plot(y ~ x, data, pch = 20)
lines(fitted(fm) ~ x, data, col = "red")
legend("topright", bty = "n", cex = 0.7, legend = capture.output(fm))
title(main = paste(main, "- AIC:", round(AIC(fm), 2)))
}
# 3 parameter model
fo3 <- y ~ 1/(b + x^c) # omit .lin parameter; plinear will add it automatically
fm3 <- nls(fo3, data = data, start = list(b = 1, c = 1), alg = "plinear")
Plot(data, fm3, "3 parameters")
# one parameter model
fo1 <- y ~ 1 / (1 + x^c)
fm1 <- nls(fo1, data, start = list(c = 1))
Plot(data, fm1, "1 parameter")
par(read.only = opar)
Adding the solutions in the other answers we can compare the AIC values. We have labelled each solution by the number of parameters it uses (the degrees of freedom would be one greater than that) and have reworked the log-log solution to use nls
instead of lm
and have a LHS of y since one cannot compare the AIC values of models having different left hand sides or using different optimization routines since the log likelihood constants used could differ.
fo2 <- y ~ exp(a + b * log(x+1))
fm2 <- nls(fo2, data, start = list(a = 1, b = 1))
fo4 <- y ~ SSbiexp(x, A1, lrc1, A2, lrc2)
fm4 <- nls(fo4, data)
aic <- AIC(fm1, fm2, fm3, fm4)
aic[order(aic$AIC), ]
giving from best AIC (i.e. fm3) to worst AIC (i.e. fm2):
df AIC
fm3 4 -329.35
fm1 2 -307.69
fm4 5 -215.96
fm2 3 -167.33