Suppose I have following df.
ind1 <- rnorm(100)
ind2 <- rnorm(100)
ind3 <- rnorm(100)
ind4 <- rnorm(100)
ind5 <- rnorm(100)
dep <- rnorm(100, mean=ind1)
df <- data.frame(dep,ind1, ind2, ind3, ind4, ind5)
I calculated 3rd order polynomial regression between dep variable and each of the ind variables separately using lapply() as follow:
polys <- lapply(df[,-1], function(x) summary(lm(dep ~ poly(x, 3, raw =
TRUE), data = df)))
Now I want to list/order best models based on AIC. I tried these without success.
stepAIC(polys)
polys$AIC
Do you have any idea how I should do this in R?
Thanks in advance!
Currently, you are conflating list objects with methods stepAIC
and AIC
that run on specific components. Specifically,
The function stepAIC
in MASS
package runs on individual model objects (e.g., lm
class types), not summary of model objects and not list of model objects (i.e., polys) with your failed attempt: stepAIC(polys)
.
The function AIC
in MASS
package runs on individual model objects usually after the stepAIC
call and is not an item of the return with your failed attempt: polys$AIC
.
Consider the following re-factoring of your code with seeded data for reproducibility to sort list of lm objects, polys, by ascending AIC values:
library(MASS)
set.seed(50619)
#... same data build code ...
# LIST OF lm OBJECTS (NO SUMMARY)
polys <- lapply(df[,-1], function(x) lm(dep ~ poly(x, 3, raw = TRUE), data = df))
# SORTED VECTOR OF AIC VALUES (USING sapply)
AIC_vec <- sort(sapply(polys, function(x) AIC(stepAIC(x))))
AIC_vec
# ind1 ind3 ind2 ind4 ind5
# 297.1865 352.3694 352.8260 352.8260 352.8260
# SORTED polys BY AIC (BEST TO WORST)
sorted_polys <- polys[names(AIC_vec)]