I have been trying to run multiple ANOVAs at once with a loop. I want to loop over both multiple predictors and multiple outcomes, so I have been trying to write a nested loop.
#data
test<-structure(list(Alcohol = c(1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L,
0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L), Smoker = c(0,
0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1
), CXMP = c(1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1,
1, 0, 1, 1, 0), CXDIAG = c(1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0,
0, 1, 1, 1, 1, 1, 1, 0, 0, 1), Treatment = c(2, 2, 1, 2, 1, 0,
2, 0, 0, 0, 2, 2, 0, 2, 0, 0, 2, 2, 1, 1, 2, 1), metformin_base = c(1L,
1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L,
1L, 0L, 0L, 1L, 0L), BMI = c(38.17, 34.14, 39.55, 49.68, 41.44,
43.23, 41.65, 53.11, 45.04, 46.78, 52.42, 51.36, 60.7, 48.36,
53.31, 43.29, 57.44, 53.44, 40.54, 41.2, 55.36, 33.95), Waist = c(120,
118.5, 129.5, 144, 133.7, 121, 118.7, 139, 120.1, 131.5, 121.5,
115, 160, 154.1, 147, 128, 134, 132.5, 118, 129, NA, NA), age = c(74.52977413,
38.02327173, 41.08966461, 63.80013689, 22.12457221, 61.06502396,
61.55509925, 32.47638604, 65.60438056, 68.6899384, 55.86584531,
39.52908967, 55.69883641, 57.83709788, 52.98288843, 32.678987,
63.43052704, 51.29637235, 52.11225188, 67.9945243, 66.7926078,
38.80903491), charleston = c(5L, 0L, 0L, 2L, 0L, 3L, 2L, 0L,
3L, 2L, 1L, 0L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 3L, 0L), FOOD_Fruit = c(1,
1.5, 1, 1, NA, 1, 2, 1, NA, 2, 0, 0, 2.5, 2, 2, 2, 3.5, 2, 2,
3, 3, 2), FOOD_Vegetable = c(3, 3.5, 2, 2, NA, 1, 1, 2, 2, 3,
2, 0, 3, 3, 3, 3, 1.5, 2.5, 2.5, 2, 5, 5), exercisemin = c(0L,
30L, 20L, 0L, NA, 0L, NA, 85L, NA, 0L, 0L, NA, 0L, 80L, 30L,
10L, 60L, 0L, 0L, 0L, 15L, 60L)), row.names = c(NA, 22L), class = "data.frame")
#data transformations
library(dplyr) # provides %>%, mutate() and across()
# subset of categorical variables (does not include Charlson or BMIfactor)
catvars<-subset(test,
select=c(Alcohol,Smoker,CXMP,CXDIAG,Treatment,metformin_base))
# convert the categorical variables to factors
catvars <- catvars %>%
mutate(across(everything(), factor))
# subset of continuous variables
contvars<-subset(test, select=c(BMI,Waist,age,charleston,
FOOD_Fruit,FOOD_Vegetable,exercisemin))
contvars <- as.data.frame(lapply(contvars, as.numeric))
I have tried all sorts of things: running the loop with the predictors in the same dataframe, with and without paste0, with and without as.formula, with different types of loop functions, with different types of anova functions, etc. My plan was to fit each combination as a linear model and then get the ANOVA summary from the results.
#linear model
anovas<-for(i in colnames(contvars)) {
for(j in colnames(catvars)) {
lm(as.formula(paste0(i , "~" , j)),
data=cbind(contvars,catvars))
}
}
#What I plan to use to get the summary once the loop works:
summary(aov(anovas))
The loop is where I get stuck. No matter what I do, it throws an error, and I have seen a very wide variety of errors. I am not sure what I am doing wrong. With this syntax, the object shows up as "NULL".
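There are a few issues here.
A for loop in R does not return a value (it evaluates to NULL), so assigning the loop itself to anovas gives NULL; each model has to be stored explicitly inside the loop.
cbind() may be unsafe; data.frame() is safer.
Each model is stored in a list, with paste(i,j,sep=".") as the name; I originally did this with a k index that I incremented at each step.
combdata <- data.frame(contvars,catvars)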
res <- list()
for(i in colnames(contvars)) {
for(j in colnames(catvars)) {
res[[paste(i,j,sep=".")]] <- lm(as.formula(paste0(i , "~" , j)),
combdata)
}
}
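As a minimal illustration of why assigning the loop itself produced NULL, and why storing into a list works, here is a sketch with toy values (the names out and squares are made up for this example and are not part of the question's data):

```r
# A for loop evaluates to NULL, whatever its body computes
out <- for (k in 1:3) k^2
is.null(out)

# Storing each result in a named list element keeps the results
squares <- list()
for (k in 1:3) {
  squares[[paste0("k", k)]] <- k^2
}
names(squares)
```

The same pattern is what the nested loop above does with res, using the outcome and predictor names to build each list name.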
You could use something like sapply(res, function(x) summary(x)$r.squared) to summarize the results.
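If the goal is the ANOVA F test rather than R-squared, one option is to call anova() on each stored model. The following is only a sketch under stated assumptions: it uses the built-in mtcars data as a stand-in for combdata, and the names dat, outs, preds and pvals are hypothetical, chosen for this example.

```r
# Stand-in data (assumption): built-in mtcars instead of the question's data
dat <- mtcars
outs <- c("mpg", "hp", "wt")     # continuous outcomes
preds <- c("cyl", "gear", "am")  # categorical predictors
dat[preds] <- lapply(dat[preds], factor)

# Same nested-loop pattern as above: one lm per outcome/predictor pair
res <- list()
for (i in outs) {
  for (j in preds) {
    res[[paste(i, j, sep = ".")]] <- lm(as.formula(paste0(i, "~", j)), data = dat)
  }
}

# anova() on an lm gives the one-way ANOVA table;
# row 1 is the predictor, and "Pr(>F)" is its p-value
pvals <- sapply(res, function(m) anova(m)[1, "Pr(>F)"])
pvals
```

With many outcome/predictor pairs, the resulting p-values are not adjusted for multiple testing; p.adjust(pvals) is one way to handle that if it matters for the analysis.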