I would like to fit a rational function to my data:
data:
[1] 2.000000 3.000000 2.333333 1.750000 2.000000 1.833333 2.416667 1.916667
[9] 1.750000 2.166667 2.116667 1.916667 1.944444 1.611111 1.722222 1.777778
[17] 1.877778 1.944444 1.958333 1.833333 2.041667 2.020833 1.908333 1.916667
[25] 1.733333 1.833333 1.800000 1.933333 1.893333 1.866667 1.888889 1.805556
[33] 1.833333 1.847222 1.822222 1.805556 1.833333 1.904762 1.880952 1.833333
[41] 1.804762 1.809524 1.708333 1.708333 1.750000 1.708333 1.683333 1.687500
[49] 1.611111 1.666667 1.648148 1.611111 1.611111 1.611111 1.650000 1.600000
[57] 1.650000 1.625000 1.630000 1.616667 1.469697 1.560606 1.590909 1.651515
[65] 1.651515 1.651515 1.513889 1.555556 1.625000 1.638889 1.647222 1.652778
[73] 1.679487 1.717949 1.705128 1.698718.
The model I would like to fit is the following:
Model <- function(t, a, b, c, d) { (a + b*t)/(1 + c*t + d*t^2) }
I know that I firstly have to give a starting list of a, b, c... for nls but I really don't know how to set the parameters. Since I'm not an expert I found in this http://www.css.cornell.edu/faculty/dgr2/teach/R/R_rat.pdf document a useful guide. But at some point it says:
"Given a set of ordered pairs (ti,yi) where in general there are repeated measurements at each value of t, the parameters of a rational function can be fitted by non-linear least-squares estimation, for example with the nls method in R. One we have the four parameters, we can compute the value of t at which this is maximized, by computing the first derivative....".
Although I don't report additional data I have another column that represents time (integers from 1:76 representing years).
Can anyone help me?
Best
E.
The model is not fully specified in the question but assuming the model in the code below and the data shown reproducibly in Note 2 below if we set c = d = 0 then it is a linear model so we can use the coefficients from a linear model fit as starting values:
fm1 <- lm(y ~ t)
st2 <- list(a = coef(fm1)[[1]], b = coef(fm1)[[2]], c = 0, d = 0)
fm2 <- nls(y ~ Model(t, a, b, c, d), start = st2)
giving:
> fm2
Nonlinear regression model
model: y ~ Model(t, a, b, c, d)
data: parent.frame()
a b c d
2.5097712 0.6038808 0.3205409 0.0008663
residual sum-of-squares: 1.684
Number of iterations to convergence: 16
Achieved convergence tolerance: 8.029e-06
Looking at the fit graphically:
# model is shown in red. See Note 1 for fm4 (blue) and fm5 (green) models.
plot(y ~ t)
lines(fitted(fm2) ~ t, col = "red")
lines(fitted(fm4) ~ t, col = "blue")
lines(fitted(fm5) ~ t, col = "green")
legend("topright", c("fm2", "fm4", "fm5"), col = c("red", "blue", "green"), lty = 1)
The following is a different model which fits almost as well but only uses 3 parameters. See blue line on graph above.
fm3 <- lm(log(y) ~ log(t))
st4 <- list(a = coef(fm3)[[1]], b = 0, c = coef(fm3)[[2]])
fm4 <- nls(y ~ exp(a + b/t + c*log(t)), start = st4)
> fm4
Nonlinear regression model
model: y ~ exp(a + b/t + c * log(t))
data: parent.frame()
a b c
0.9845 -0.1767 -0.1157
residual sum-of-squares: 1.685
Number of iterations to convergence: 4
Achieved convergence tolerance: 2.625e-06
Also this model isn't too bad. It uses only two parameters, it is linear in them and it has a residual sum of squares of 1.728837 compared to 1.684 for the fm2 model and 1.685 for the fm4 model. See green line on graph above.
fm5 <- lm(y ~ log(t))
> fm5
Call:
lm(formula = y ~ log(t))
Coefficients:
(Intercept) log(t)
2.4029 -0.1793
> deviance(fm5)
[1] 1.728837
y <- c(2, 3, 2.333333, 1.75, 2, 1.833333, 2.416667, 1.916667, 1.75,
2.166667, 2.116667, 1.916667, 1.944444, 1.611111, 1.722222, 1.777778,
1.877778, 1.944444, 1.958333, 1.833333, 2.041667, 2.020833, 1.908333,
1.916667, 1.733333, 1.833333, 1.8, 1.933333, 1.893333, 1.866667,
1.888889, 1.805556, 1.833333, 1.847222, 1.822222, 1.805556, 1.833333,
1.904762, 1.880952, 1.833333, 1.804762, 1.809524, 1.708333, 1.708333,
1.75, 1.708333, 1.683333, 1.6875, 1.611111, 1.666667, 1.648148,
1.611111, 1.611111, 1.611111, 1.65, 1.6, 1.65, 1.625, 1.63, 1.616667,
1.469697, 1.560606, 1.590909, 1.651515, 1.651515, 1.651515, 1.513889,
1.555556, 1.625, 1.638889, 1.647222, 1.652778, 1.679487, 1.717949,
1.705128, 1.698718)
t <- seq_along(y)