mice: splitting imputed data for further analysis

I am using the mice package (version 3.3.0) to perform multiple imputation (MI). The MI procedure works fine. For further analysis I would like to split the imputed data by the variable ‘group’, as indicated in the example below.

library(mice)

d <- nhanes
d$group <- as.factor(c(rep("A", 13), rep("B", 12)))
str(d)

imp <- mice(d)

fit <- with(imp, lm(bmi ~ age + chl + group))
est <- pool(fit)
summary(est, digits=3)

# What I would like to do is something like this (pseudocode, does not run):
imp.A <- imp[which(group=="A")]
imp.B <- imp[which(group=="B")]

fit.A <- with(imp.A, lm(bmi ~ age + chl))
fit.B <- with(imp.B, lm(bmi ~ age + chl))

Is it possible to split imputed data somehow?

Solution

I think the following code achieves what you are asking for.

First create a long format version of all your datasets:

d.long <- mice::complete(imp, action = "long", include = TRUE)

Next, perform your grouping as normal using base R:

d.long.A <- d.long[which(d.long$group == 'A'),]
d.long.B <- d.long[which(d.long$group == 'B'),]

Then convert these back to mids objects, so you can perform mice operations on them:

imp.A <- as.mids(d.long.A)
imp.B <- as.mids(d.long.B)

You'll probably get a warning message, because group is now constant within each subset:

Warning message:
Number of logged events: 1
imp.A$loggedEvents
  it im dep     meth      out
1  0  0     constant group

This shouldn't be a problem; it's just mice telling you that there is a constant variable in your dataset. Finally, you can use your new subsets for your regression models:

fit.A <- with(imp.A, lm(bmi ~ age + chl))
fit.B <- with(imp.B, lm(bmi ~ age + chl))

Then use pool to get the pooled results for each subset. I'm not entirely sure why you want to do this instead of just including the group variable in your regression model, but I assume you have a reason. Hope this helps!
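For completeness, here is a minimal end-to-end sketch of the steps above, including the pooling step. The seed and printFlag = FALSE are my own additions for reproducibility and quieter output, and est.A / est.B are just illustrative names; none of them come from the question.

```r
library(mice)

# Example data from the question: nhanes plus an artificial grouping factor
d <- nhanes
d$group <- as.factor(c(rep("A", 13), rep("B", 12)))

# Impute once on the full data (seed is an assumption, added for reproducibility)
imp <- mice(d, printFlag = FALSE, seed = 123)

# Long format; include = TRUE keeps the original data as .imp == 0,
# which as.mids() needs in order to rebuild a mids object
d.long <- mice::complete(imp, action = "long", include = TRUE)

# Split by group and convert each part back to a mids object
# (expect a logged event here because group is constant in each subset)
imp.A <- as.mids(d.long[which(d.long$group == "A"), ])
imp.B <- as.mids(d.long[which(d.long$group == "B"), ])

# Fit the model in each subset and pool across imputations
fit.A <- with(imp.A, lm(bmi ~ age + chl))
fit.B <- with(imp.B, lm(bmi ~ age + chl))
est.A <- pool(fit.A)
est.B <- pool(fit.B)

summary(est.A)
summary(est.B)
```

Note that fitting separate models lets every coefficient (and the residual variance) differ between groups, whereas the model in the question, lm(bmi ~ age + chl + group), only allows the intercept to shift by group.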