I want to run VIF (variance inflation factor) tests by fitting consecutive regressions within a dataset, each time using one variable as the response and the remaining variables as predictors.
To that end, I will put my code in a for loop that steps through the column indices, using the indexed column as the response and the remaining columns as predictors.
I am using the data.table package and the mtcars dataset that ships with R to create a reproducible example:
library(data.table)
data(mtcars)
setDT(mtcars)
# Let i (the index of the response) be 1 for demonstration purposes
i <- 1
variables <- names(mtcars)
response <- names(mtcars)[i]
predictors <- setdiff(variables, response)
model <- glm(mtcars[, get(response)] ~ mtcars[, predictors , with = FALSE], family = "gaussian")
However, this results in an error message:
Error in model.frame.default(formula = mtcars[, get(response)] ~ mtcars[, : invalid type (list) for variable 'mtcars[, predictors, with = FALSE]'
Could you explain the error and help me correct the code?
Your advice will be appreciated.
=============================================================================
When I reproduced the suggested code, I got an error message:
> library(car)
> library(data.table)
>
> data(mtcars)
> setDT(mtcars)
> model <- glm(formula = mpg ~ .,data=mtcars , family = "gaussian")
> vif(model)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘vif’ for signature ‘"glm"’
Update:
The code ran without problems when I specified the package explicitly, i.e.:
car::vif(model)
I had to amend Fredrik's code as follows to get the coefficients of all the variables:
rhs <- paste(predictors, collapse ="+")
full_formula <- paste(response, "~", rhs)
full_formula <- as.formula(full_formula)
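For reference, here is a minimal sketch of how this formula construction could sit inside the for loop described at the top. It is only an illustration under a few assumptions: it uses lm() instead of glm() (for a Gaussian family the fits are equivalent), it computes each VIF directly as 1 / (1 - R^2) rather than calling car::vif, and the name vifs is made up for this example. It runs on the plain data.frame; calling setDT(mtcars) first should not change the result.

```r
data(mtcars)

variables <- names(mtcars)

# Named vector to collect one VIF per column (hypothetical name)
vifs <- setNames(numeric(length(variables)), variables)

for (i in seq_along(variables)) {
  response   <- variables[i]
  predictors <- setdiff(variables, response)

  # Build the formula from strings, e.g. "mpg ~ cyl + disp + ..."
  rhs          <- paste(predictors, collapse = " + ")
  full_formula <- as.formula(paste(response, "~", rhs))

  # Regress the current column on all remaining columns
  fit <- lm(full_formula, data = mtcars)

  # VIF of the current column: 1 / (1 - R^2) of that regression
  vifs[i] <- 1 / (1 - summary(fit)$r.squared)
}

print(round(vifs, 2))
```

Because the formula is built from column names and the data is passed through the data argument, each term resolves to an ordinary numeric column, which avoids the "invalid type (list)" error from the original attempt.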
Another solution is based on the use of glm.fit:
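# glm.fit() does not add an intercept, so bind a column of ones to the predictor matrix
model <- glm.fit(x = cbind(Intercept = 1, as.matrix(mtcars[, ..predictors])), y = mtcars[[response]], family = gaussian())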