I have successfully used mice to do multiple imputation on a data frame. I would now like to run glm on that dataset. My outcome variable is "MI" and my independent variables are "Hypertension" and "Diabetes".
I have tried:
dat <- mice(regression)
model <- which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))
But I get the following error:
Error in which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")):
argument to 'which' is not logical.
Does anybody know why this is?
I think you are getting an error because you are using which() instead of with(). which() is a function that asks (in layperson's terms), "Which of these values are true?" You have to give it something that can be true or false.
with() is a function that says, "With this dataset, evaluate some expression inside it." You provide some kind of data environment (e.g., a list or a data frame) and can then use the vectors inside it without needing to name that data environment again.
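To make the difference concrete, here is a minimal sketch in base R. The data frame `toy` and its columns are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data, for illustration only
toy <- data.frame(x = c(1, 5, 10), y = c(2, 4, 6))

# which() takes a logical vector and returns the positions that are TRUE
idx <- which(toy$x > 3)
idx

# with() evaluates an expression using the columns of `toy` directly,
# so there is no need to write toy$x or toy$y
total <- with(toy, sum(x * y))
total
```

Passing anything other than a logical vector as the first argument of which(), such as a "mids" object, is what triggers the "argument to 'which' is not logical" error.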
with() can be used with the mice package like this:
# example data frame
set.seed(123)
df <- data.frame(
  MI = factor(rep(c(0, 1), 5)),
  Hypertension = c(rnorm(9), NA),
  Diabetes = c(NA, rnorm(9)))
# imputation
library(mice)
dat <- mice(df)
with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))
with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")) shows you the glm() output for each imputation in dat. mice() does five imputations by default.
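If you want to confirm that one model was fitted per imputed dataset, you can inspect the object that with() returns. This is only a sketch built on the example data above: the names `toy_df`, `imp` and `fit` are mine, and `printFlag` and `seed` are set just to keep the run quiet and reproducible.

```r
library(mice)

# Same example data as above, under a different (hypothetical) name
set.seed(123)
toy_df <- data.frame(
  MI = factor(rep(c(0, 1), 5)),
  Hypertension = c(rnorm(9), NA),
  Diabetes = c(NA, rnorm(9))
)

# Impute, then fit the same glm() on every imputed dataset
imp <- mice(toy_df, printFlag = FALSE, seed = 123)
fit <- with(imp, glm(MI ~ Hypertension + Diabetes, family = "binomial"))

imp$m                 # number of imputations stored in the mids object
class(fit)            # class of the object returned by with()
length(fit$analyses)  # one fitted glm per imputation
```

Each element of `fit$analyses` is an ordinary glm fit, so for example `summary(fit$analyses[[1]])` shows the model for the first imputed dataset.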
An alternative with glm.mids()
Why doesn't glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) work? It gives an error, "Error cannot coerce class ‘mids’ to a data.frame", because the imputed dat is not a data frame.
Instead, mice has a function for running glm() on a multiply imputed data set ("mids"), glm.mids():
#glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it does not work
glm.mids(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it works
with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")) # does the same thing
Edit Note
When you use with() on a "mids" object while the mice package is loaded, I think it actually calls the mice package's with.mids() method, which is what allows with() to work with the "mids" data class. with.mids() supersedes glm.mids(). See here for details: https://rdrr.io/cran/mice/man/with.mids.html