My GLM models work fine, but in order to speed things up I want to put the variable names in a loop and have R fit the GLM for each variable. So I tried this:
varlist <- c("age", "doc")
for (i in 1:length(varlist))
{
glmmodel <- glm(formula = Kommunikation ~ varlist[i], family = binomial, data = analysis_data)
univars[i,1] <- names(coef(glmmodel))[2]
univars[i,2] <- exp(confint.default(glmmodel)[2,1])
univars[i,3] <- exp(glmmodel$coefficients[2])
univars[i,4] <- exp(confint.default(glmmodel)[2,2])
}
Unfortunately, this results in the error:
Error in model.frame.default(formula = Kommunikation ~ varlist[[i]], data = analysis_data, :
variable lengths differ (found for 'varlist[[i]]')
As the GLM model works when I substitute varlist[i] with the respective variable names age and doc, I would guess the issue stems from how R reads the character strings when substituting the variable name? (I have 26 variables but am posting only two for convenience.)
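The problem is that you're trying to drop a string into your formula in your call to glm(). Inside a formula, the subscript expression is not replaced by the column name it holds; R evaluates it as is and gets a length-one character vector, which does not match the length of Kommunikation, hence the "variable lengths differ" error. Here's one possible solution in which you pass the string to a call to as.formula and then use that formula in the model instead.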
df <- data.frame(y = rbinom(10, 1, 0.5), x1 = rnorm(10), x2 = rnorm(10))
varlist <- c("x1", "x2")
univars <- data.frame() # create an empty data frame so the rest of your code works
for (i in seq_along(varlist))
{
mod <- as.formula(sprintf("y ~ %s", varlist[i]))
glmmodel <- glm(formula = mod, family = binomial, data = df)
univars[i,1] <- names(coef(glmmodel))[2]
univars[i,2] <- exp(confint.default(glmmodel)[2,1])
univars[i,3] <- exp(glmmodel$coefficients[2])
univars[i,4] <- exp(confint.default(glmmodel)[2,2])
}
Result:
> univars
V1 V2 V3 V4
1 x1 0.4728192 3.1185658 20.569074
2 x2 0.1665581 0.7241709 3.148592
Also, I would be inclined to do this with lapply instead of a for loop, making rows that you can then bind. But this works.
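For what it's worth, here is a rough sketch of what the lapply version might look like. It is only one way to do it, under a few assumptions of mine: the helper name fit_one, the column names term/lower/or/upper, the fixed seed, and the sample size of 50 are all made up for illustration. It also uses reformulate() from the stats package in place of as.formula(sprintf(...)); both build the same kind of formula from strings.

```r
set.seed(1)  # assumption: fixed seed so the simulated data is reproducible
df <- data.frame(y = rbinom(50, 1, 0.5), x1 = rnorm(50), x2 = rnorm(50))
varlist <- c("x1", "x2")

# Hypothetical helper: fit one univariate logistic model and
# return a one-row data frame with the odds ratio and its Wald interval.
fit_one <- function(v, data) {
  mod <- reformulate(v, response = "y")      # builds y ~ <v> from strings
  fit <- glm(mod, family = binomial, data = data)
  ci  <- exp(confint.default(fit))           # Wald limits on the odds-ratio scale
  data.frame(term  = names(coef(fit))[2],
             lower = ci[2, 1],
             or    = unname(exp(coef(fit)[2])),
             upper = ci[2, 2])
}

# One row per variable, bound into a single data frame
univars <- do.call(rbind, lapply(varlist, fit_one, data = df))
univars
```

Like the loop above, this only picks up the second coefficient, so for a factor predictor with more than two levels it would report just the first dummy term; you would need to adapt the helper if some of your 26 variables are like that.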