I would like to analyse many x variables (400 variables) against one y variable (1 variable). However I do not want to write for each and every x variable a new model. Is it possible to write one model which than checks all x variables with y in R-Studio?
If your data frame is DF,
regs <- list()
for (v in setdiff(names(DF), "y")) {
fm <- eval(parse(text = sprintf("y ~ %s", v)))
regs[[v]] <- lm(fm, data=DF)
}
Now you have all simple regression results in the regs
list.
Example:
## Generate data
n <- 1000
set.seed(1)
DF <- data.frame(y = rnorm(n))
for (j in seq(400)) DF[[paste0('x',j)]] <- rnorm(n)
## Now data ready
dim(DF)
# [1] 1000 401
head(names(DF))
# [1] "y" "x1" "x2" "x3" "x4" "x5"
tail(names(DF))
# [1] "x395" "x396" "x397" "x398" "x399" "x400"
regs <- list()
for (v in setdiff(names(DF), "y")) {
fm <- eval(parse(text = sprintf("y ~ %s", v)))
regs[[v]] <- lm(fm, data=DF)
}
head(names(regs))
# [1] "x1" "x2" "x3" "x4" "x5" "x6"
r2s <- sapply(regs, function(x) summary(x)$r.squared)
head(r2s, 3)
# x1 x2 x3
# 0.0000409755 0.0024376111 0.0005509134