Tags: r, missing-data, imputation, multi-level, r-mice

Imputation with mice - exclude variable from being imputed but still use as predictor


I am trying to impute missing data with mice and have run into a problem. I want three variables (sess, depmed and prevpsychoth) to be used as predictors without being imputed themselves. I set up the predictor matrix as follows:

# Initialize
m0 = mice(datimp, maxit = 0)

predmat = m0$predictorMatrix
m0$predictorMatrix
m0$method

meth = m0$method
meth["cm.dep.0"] = "2l.pan"
meth["cm.dep.1"] = "2l.pan"
meth["cm.dep.2"] = "2l.pan"


# Group is added as a fixed effect to all variables
predmat[,"trial"] = -2
predmat[,"group"] = 1
predmat["sess",]= 0
predmat["depmed",]= 0
predmat["prevpsychoth",]= 0
predmat

             trial group sex age sess degree rel child depmed prevpsychoth cm.dep.0 cm.dep.1
trial           -2     1   1   1    1      1   1     1      1            1        1        1
group           -2     1   1   1    1      1   1     1      1            1        1        1
sex             -2     1   0   1    1      1   1     1      1            1        1        1
age             -2     1   1   0    1      1   1     1      1            1        1        1
sess             0     0   0   0    0      0   0     0      0            0        0        0
degree          -2     1   1   1    1      0   1     1      1            1        1        1
rel             -2     1   1   1    1      1   0     1      1            1        1        1
child           -2     1   1   1    1      1   1     0      1            1        1        1
depmed           0     0   0   0    0      0   0     0      0            0        0        0
prevpsychoth     0     0   0   0    0      0   0     0      0            0        0        0
cm.dep.0        -2     1   1   1    1      1   1     1      1            1        0        1
cm.dep.1        -2     1   1   1    1      1   1     1      1            1        1        0
cm.dep.2        -2     1   1   1    1      1   1     1      1            1        1        1
             cm.dep.2
trial               1
group               1
sex                 1
age                 1
sess                0
degree              1
rel                 1
child               1
depmed              0
prevpsychoth        0
cm.dep.0            1
cm.dep.1            1
cm.dep.2            0

In the end, sess, depmed and prevpsychoth are still being imputed. Any ideas why this happens?


Solution

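  • In the predictor matrix, rows are the variables being imputed and columns are the predictors. Zeroing a row therefore only strips a variable of its predictors; it is still imputed as long as its method is non-empty. Emptying the method of the not-to-be-imputed variables is what switches imputation off. Leave their columns untouched to keep them as predictors, or set the column (rather than the row) to zero to also remove them as predictors of the other variables. Doing both, as below, should work. Example with the nhanes data set from mice: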

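    library(mice)
    m0 <- mice(nhanes, maxit = 0)   # dry run to get the default settings
    
    # Empty method: bmi is not imputed
    meth <- m0$method
    meth["bmi"] <- ""
    # Zero column: bmi is not used as a predictor of the other variables
    pred <- m0$predictorMatrix
    pred[, "bmi"] <- 0
    
    imp <- mice(nhanes, predictorMatrix = pred, method = meth)
    imp$imp
    complete(imp, "long")[1:25, ]
    
    # For comparison: with the default method vector, bmi is still imputed
    imp_default <- mice(nhanes, predictorMatrix = pred, method = m0$method)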
    

    Result

    complete(imp, "long")[1:25, ]
    #    .imp .id age  bmi hyp chl
    # 1     1   1   1   NA   1 238
    # 2     1   2   2 22.7   1 187
    # 3     1   3   1   NA   1 187
    # 4     1   4   3   NA   1 206
    # 5     1   5   1 20.4   1 113
    # 6     1   6   3   NA   1 184
    # 7     1   7   1 22.5   1 118
    # 8     1   8   1 30.1   1 187
    # 9     1   9   2 22.0   1 238
    # 10    1  10   2   NA   1 186
    # 11    1  11   1   NA   1 238
    # 12    1  12   2   NA   1 186
    # 13    1  13   3 21.7   1 206
    # 14    1  14   2 28.7   2 204
    # 15    1  15   1 29.6   1 238
    # 16    1  16   1   NA   1 238
    # 17    1  17   3 27.2   2 284
    # 18    1  18   2 26.3   2 199
    # 19    1  19   1 35.3   1 218
    # 20    1  20   3 25.5   2 199
    # 21    1  21   1   NA   1 187
    # 22    1  22   1 33.2   1 229
    # 23    1  23   1 27.5   1 131
    # 24    1  24   3 24.9   1 186
    # 25    1  25   2 27.4   1 186
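
To apply the same idea to several variables at once, the two steps can be wrapped in a small helper. This is only a sketch. `freeze_vars()` and its `drop_as_predictor` argument are made-up names for illustration and are not part of mice. The small `m` and `maxit` values and the seed are arbitrary choices to keep the run short.

```r
library(mice)

# Hypothetical helper (not part of mice): switch off imputation for `vars`.
# With drop_as_predictor = TRUE their columns are also zeroed, so they are
# not used to predict other variables.
freeze_vars <- function(ini, vars, drop_as_predictor = FALSE) {
  meth <- ini$method
  pred <- ini$predictorMatrix
  meth[vars] <- ""                 # empty method: variable is skipped
  if (drop_as_predictor) {
    pred[, vars] <- 0              # column = role as a predictor
  }
  list(method = meth, predictorMatrix = pred)
}

ini <- mice(nhanes, maxit = 0)     # dry run, nothing is imputed yet
setup <- freeze_vars(ini, "bmi", drop_as_predictor = TRUE)

imp <- mice(nhanes,
            method = setup$method,
            predictorMatrix = setup$predictorMatrix,
            m = 2, maxit = 2, seed = 1, printFlag = FALSE)

# Missing values left in the first completed data set, per variable
na_left <- colSums(is.na(complete(imp, 1)))
print(na_left)
```

For the data in the question, `vars` would presumably be `c("sess", "depmed", "prevpsychoth")`. Whether `drop_as_predictor` can stay `FALSE` depends on whether those variables have missing values: gaps that are left unimputed may carry over into the variables they predict, so checking `na_left` after the run is a cheap safeguard either way.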