I am trying to run a non-linear regression on a dataset whereby I would like to run a new regression for each group. The data frame is much like this one:
Date <- as.POSIXct(c("2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25"))
Ts <- rnorm(25, mean=10, sd=0.5)
Exp_flux <- 3.5*exp((Ts-10)/10)
Collar <- as.factor(c("t1","t2","t3","t4","t5","t1","t2","t3","t4","t5","t1","t2","t3","t4",
"t5","t1","t2","t3","t4","t5","t1","t2","t3","t4","t5"))
df <- data.frame(Date,Collar,Ts,Exp_flux)
df
Date Collar Ts Exp_flux
1 2021-05-25 t1 9.596453 3.361570
2 2021-05-20 t2 8.870983 3.126334
3 2021-05-21 t3 10.011902 3.504168
4 2021-05-22 t4 10.480873 3.672418
5 2021-05-23 t5 10.264998 3.593989
6 2021-05-24 t1 10.196256 3.569368
7 2021-05-25 t2 9.523135 3.337014
8 2021-05-20 t3 10.315953 3.612349
9 2021-05-21 t4 9.510503 3.332801
10 2021-05-22 t5 10.300981 3.606945
11 2021-05-23 t1 10.788605 3.787187
12 2021-05-24 t2 10.226902 3.580323
13 2021-05-25 t3 9.005530 3.168683
14 2021-05-20 t4 10.752006 3.773351
15 2021-05-21 t5 9.335704 3.275051
16 2021-05-22 t1 9.345418 3.278234
17 2021-05-23 t2 10.034693 3.512164
18 2021-05-24 t3 10.754786 3.774401
19 2021-05-25 t4 9.655313 3.381415
20 2021-05-20 t5 10.670903 3.742872
21 2021-05-21 t1 8.986950 3.162801
22 2021-05-22 t2 10.441217 3.657883
23 2021-05-23 t3 10.446326 3.659753
24 2021-05-24 t4 10.550104 3.697931
25 2021-05-25 t5 10.442247 3.658260
My aim here is to run a separate regression of Exp_flux vs Ts for each collar type. I know I could separate the main dataset into subsets for each collar and perform each regression manually but in reality there are more than 20 collar types and I figured there must be a more efficient way to do this. I have tried using the nlsList
function of the nlme
package, which just gives an empty list or (in previous cases) the regression of only the first collar:
fit.collars <- nlsList(Exp_Flux ~ SRref*q^((Ts-10)/10)| Collar,
data=df, start=list(SRref=3, q=2), na.action = na.omit )
summary(fit.collars)
Error in class(val) <- c("summary.nlsList", class(val)) :
attempt to set an attribute on NULL
I must be using the nlsList function incorrectly but I can't figure out how so. Tutorials on this function are pretty scarce online. Can anyone advise on this or a relatively simple alternative?
Let me quote help("nls")
:
The default settings of nls generally fail on artificial “zero-residual” data problems.
If I add some white noise and fix the typo, I get successful fits.
set.seed(42)
Date <- as.POSIXct(c("2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25","2021-05-20", "2021-05-21","2021-05-22",
"2021-05-23","2021-05-24" ,"2021-05-25"))
Ts <- rnorm(25, mean=10, sd=0.5)
Exp_flux <- 3.5*exp((Ts-10)/10) + rnorm(25, sd = 0.01)
Collar <- as.factor(c("t1","t2","t3","t4","t5","t1","t2","t3","t4","t5","t1","t2","t3","t4",
"t5","t1","t2","t3","t4","t5","t1","t2","t3","t4","t5"))
df <- data.frame(Date,Collar,Ts,Exp_flux)
library(nlme)
fit.collars <- nlsList(Exp_flux ~ SRref*q^((Ts-10)/10)| Collar,
data=df, start=list(SRref=3, q=2), na.action = na.omit )
summary(fit.collars)
#works
Please consider carefully if you really want a pooled residual standard error.