
Multiple imputation (mice) with glm: "argument to 'which' is not logical"


I have successfully used mice to do multiple imputation on a data frame. I would now like to run glm on the imputed data. My outcome variable is "MI" and my independent variables are "Hypertension" and "Diabetes". I have tried:

dat <- mice(regression)
model <- which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))

But I get the following error:

Error in which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")):
argument to 'which' is not logical.

Does anybody know why this is?


Solution

  • I think you are getting an error because you are using which() instead of with(). which() is a function that asks (in layperson's terms), "Which of these values are true?" You have to give it something that can be true or false, that is, a logical vector.

    with() is a function that's like, "With this dataset, evaluate this expression inside it." You have to provide some kind of data environment (e.g., a list, a data frame), and you can then use the vectors inside it without needing to name that data environment again.
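
    As a minimal base-R sketch of the difference (the vector x and the data frame toy are made up for illustration and are not from the question):

    ```r
    # which() takes a logical vector and returns the positions of the TRUE values
    x <- c(3, 8, 1, 9)
    big <- which(x > 5)               # positions 2 and 4

    # with() evaluates an expression using the columns of a data frame
    toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
    total <- with(toy, sum(a * b))    # same as sum(toy$a * toy$b)

    # giving which() something that is not logical fails, as in the question
    msg <- tryCatch(which(toy), error = function(e) conditionMessage(e))
    msg
    ```

    The last call fails for the same reason as which(dat, glm(...)): the first argument is a data object, not a vector of TRUE/FALSE values.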

    with() can be used with the mice package like this:

    # example data frame
    set.seed(123)
    df <- data.frame(
        MI = factor(rep(c(0,1),5)), 
        Hypertension = c(rnorm(9), NA), 
        Diabetes = c(NA, rnorm(9)))
    
    # imputation
    library(mice)
    
    dat <- mice(df)
    
    with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))
    

    with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")) shows you the glm() outputs for each imputation in dat. mice() does five imputations by default.
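
    Printing five separate fits is usually not the end goal. A common next step, which is not part of the question, is to combine them with mice's pool(). The sketch below is an assumption about what you probably want to do next. It rebuilds the example data so that it runs on its own, and printFlag and seed are only there to keep the output quiet and repeatable:

    ```r
    library(mice)

    # same example data frame as above
    set.seed(123)
    df <- data.frame(
        MI = factor(rep(c(0, 1), 5)),
        Hypertension = c(rnorm(9), NA),
        Diabetes = c(NA, rnorm(9)))

    dat <- mice(df, printFlag = FALSE, seed = 123)  # m = 5 imputations by default

    # one glm() fit per imputed data set, returned as a "mira" object
    fit <- with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))
    length(fit$analyses)   # number of fitted models, one per imputation

    # combine the per-imputation estimates into one set of coefficients
    pooled <- pool(fit)
    summary(pooled)
    ```

    With only ten rows the estimates themselves mean little; the point is the with() then pool() workflow.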

    An alternative with glm.mids()

    Why doesn't glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) work? It fails with an error like "cannot coerce class ‘mids’ to a data.frame", because dat is a "mids" object rather than a data frame.

    Instead, mice has a function for running glm() on a multiply imputed data set ("mids"), glm.mids():

    #glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it does not work
    
    glm.mids(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it works
    
    with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")) # does the same thing
    

    Edit Note

    When you call with() on an object created by mice(), R dispatches to the mice package's with.mids() method, which is what lets with() work with the "mids" class. with.mids() supersedes glm.mids(). See here for details: https://rdrr.io/cran/mice/man/with.mids.html
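
    If you want to check this yourself, here is a small sketch. It uses nhanes, an example data set that ships with mice, instead of the data from the question, and the lm() formula is only a placeholder:

    ```r
    library(mice)

    # nhanes is a small example data set included in mice
    dat <- mice(nhanes, printFlag = FALSE, seed = 1)

    class(dat)                                  # includes "mids"
    is.function(getS3method("with", "mids"))    # a with() method for "mids" is registered

    fit <- with(dat, lm(bmi ~ age))             # with() is dispatched to that method
    inherits(fit, "mira")                       # the result holds one fit per imputation
    ```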