I am writing a multilevel regression model, in which I begin the second level with a dataframe of predictands (coefficients from the first level) and a dataframe of predictors. Both dataframes have the same number of observations.
I wish to loop over the predictands (columns in the first dataframe) and use lm() to regress each of them against the entire second dataframe of predictors. However, when I do, I get an error that I cannot figure out.
Example:
data(iris)
iris1 <- iris[-5] # remove the categories
iris2 <- iris[-5] * 6
for (col in names(iris1)) {
  lm(iris1[col] ~ iris2)
}
## Error in model.frame.default(formula = iris1[col] ~ iris2, drop.unused.levels = TRUE) :
## invalid type (list) for variable 'iris1[col]'
I just can't understand what this means or why R considers iris1[col] to be a list.
For simplicity's sake I've tried merging them:
for (col in names(iris1)) {
  tmp_df <- cbind(iris1[col], iris2)
  colnames(tmp_df) <- letters[1:5] # to avoid duplicate names
  lm(1 ~ ., tmp_df)
}
## Error in model.frame.default(formula = 1 ~ ., data = tmp_df, drop.unused.levels = TRUE) :
## variable lengths differ (found for 'a')
And this one's particularly frustrating because they are clearly the same length.
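Both errors can be reproduced and avoided with small changes. The sketch below assumes the same iris1 and iris2 as above; fit and fit2 are names introduced only for illustration. Single brackets on a data frame return a one-column data frame, which is a list, while double brackets extract the underlying numeric vector. In the merged version, the response in `1 ~ .` is the constant 1 (length 1), which is why its length differs from the 150 rows of column a.

```r
data(iris)
iris1 <- iris[-5]
iris2 <- iris[-5] * 6

col <- "Sepal.Length"
is.list(iris1[col])       # single brackets keep a one-column data frame (a list)
is.numeric(iris1[[col]])  # double brackets extract the numeric vector

# With [[ ]] the response is a plain vector; "." expands to the columns of iris2.
fit <- lm(iris1[[col]] ~ ., data = iris2)

# In the merged data frame, name the response column instead of writing 1.
tmp_df <- cbind(iris1[col], iris2)
colnames(tmp_df) <- letters[1:5]  # to avoid duplicate names
fit2 <- lm(a ~ ., data = tmp_df)
```

Either form can be placed inside the original for loop, storing each fit in a list rather than discarding it.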
Note that lm can accept a matrix on the left hand side of the formula so we could do this:
lm(as.matrix(iris1) ~., iris2)
or if we want a separate lm object for each column of iris1:
regr <- function(y) lm(y ~ ., iris2)
Map(regr, iris1)
or
regr2 <- function(nm) {
  # the response comes from iris1; the predictors are all columns of iris2
  fo <- as.formula(sprintf("iris1$%s ~ .", nm))
  do.call("lm", list(fo, quote(iris2)))
}
Map(regr2, names(iris1))
or use lm.fit:
regr.fit <- function(y) lm.fit(cbind(1, as.matrix(iris2)), y)
Map(regr.fit, iris1)
Note that the component names of the result will be the corresponding column names of iris1.
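As a quick check of the naming behaviour, here is a minimal sketch assuming the same iris1 and iris2 as in the question; fits and mfit are names introduced only for illustration. It also compares one per-column fit against the matching column of the matrix-response fit.

```r
data(iris)
iris1 <- iris[-5]
iris2 <- iris[-5] * 6

regr <- function(y) lm(y ~ ., iris2)
fits <- Map(regr, iris1)

# The list of fits is named after the columns of iris1.
identical(names(fits), names(iris1))

# The matrix-response fit has one coefficient column per response.
mfit <- lm(as.matrix(iris1) ~ ., iris2)
dim(coef(mfit))

# Coefficients for a single response should agree across the two approaches.
all.equal(unname(coef(fits$Sepal.Length)),
          unname(coef(mfit)[, "Sepal.Length"]))
```

A single response can then be picked out by name, for example fits$Petal.Width or fits[["Petal.Width"]].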