r imputation recode r-mice

Recode a variable within the mids object of the mice library in R


The following dataset is available:

data <- structure(list(asdas_6month = c(23.1222666868239, 25.4056847196073, 
25.9886630231065, NA, 26.9450864282904, 15.1832953552198, 22.1618055512694, 
NA, 24.1387146612986, 25.598233740795, 22.6844495409994, 25.0138310842063, 
20.9944595011522, 17.0762423377328, NA, NA, 20.2359010676347, 
17.5468970969989, 22.9765676870538, 26.3032333127368, NA, NA, 
NA, 17.3203951667699, 19.126959104744), gender = structure(c(1L, 
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L), .Label = c("Female", "Male"), class = "factor"), 
    age = c(47.9379517873091, 46.837373193357, 48.5646295793097, 
    43.1378807456583, 60.3619393447192, 70.1290549397305, 84.3587981654008, 
    59.2292347942614, 41.7327157246053, 52.0137845399698, 55.0951441078166, 
    71.6184307122057, 43.3101374804154, 33.5854501557607, 51.9032470737109, 
    68.1204996602706, 42.9427562299075, 55.909031412815, 29.895500127283, 
    20.9296411673894, 29.3957377286062, 46.974102661638, 54.6740110130539, 
    42.6997039072135, 67.3413773507263), asdas_baseline = c(63.7251494911822, 
    NA, 65.0638161875852, 70.1816100941605, 53.1972327260365, 
    62.980030777934, 60.3085321252511, 58.9998256902073, 56.8045598820947, 
    54.4446059090559, NA, 61.7293600038226, 56.5674724119214, 
    62.8593507709476, NA, 54.9028311743253, NA, NA, 67.6467591815449, 
    58.5134614505046, 59.3735346553234, 51.9158516755166, 63.0645651881476, 
    58.7759004270177, 55.0687922895208)), class = "data.frame", row.names = c(NA, 
-25L))

Here is what it looks like:

'data.frame':   25 obs. of  4 variables:
 $ asdas_6month  : num  23.1 25.4 26 NA 26.9 ...
 $ gender        : Factor w/ 2 levels "Female","Male": 1 2 1 1 1 1 2 2 2 1 ...
 $ age           : num  47.9 46.8 48.6 43.1 60.4 ...
 $ asdas_baseline: num  63.7 NA 65.1 70.2 53.2 ...

Using the following code, I can generate a mids object with the mice library that holds five imputed datasets:

library(mice)
new_imp <- mice(data, m = 5, maxit = 10, printFlag = FALSE, seed = 449)
print(new_imp)

Number of multiple imputations:  5 
Imputation methods:
  asdas_6month         gender            age asdas_baseline 
         "pmm"             ""             ""          "pmm" 
PredictorMatrix:
               asdas_6month gender age asdas_baseline
asdas_6month              0      1   1              1
gender                    1      0   1              1
age                       1      1   0              1
asdas_baseline            1      1   1              0

I want to derive a new variable, asdas_improvement, that flags an improvement of 40 or more in ASDAS score between baseline and 6 months (the threshold used in the code below). On an ordinary data frame I can calculate this with dplyr's mutate function as follows:

library(dplyr)
data %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))
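
As a small illustration of why the raw data frame is not enough, the sketch below applies the same rule to a toy data frame. The name `toy` is hypothetical, and its values are the rounded first four rows from the str() output above. Wherever either score is missing, the difference is NA, so the indicator is NA as well.

```r
library(dplyr)

# Hypothetical toy data: rounded first four rows of the real dataset.
toy <- data.frame(
  asdas_baseline = c(63.7, NA, 65.1, 70.2),
  asdas_6month   = c(23.1, 25.4, 26.0, NA)
)

toy <- toy %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))

# Rows 2 and 4 have a missing score, so their indicator is NA.
toy$asdas_improvement
```

This is why the indicator has to be computed on the imputed values rather than on the incomplete data.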

How does one recode a similar variable inside the mids object?


Solution

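  • To run a calculation on the imputed datasets, use complete() to extract them as a single long-format data frame, apply mutate as usual, and then convert the result back into a mids object with as.mids(). Setting include = TRUE keeps the original, unimputed data in the stack (as .imp == 0), which as.mids() needs in order to rebuild the object.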

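    library(mice)
    library(dplyr)
    
    # Stack the original data (.imp == 0) and the five completed datasets,
    # then derive the new variable in every copy at once.
    full.impdata <- complete(new_imp, 'long', include = TRUE) %>%
      mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))
    
    # Convert the long data frame back into a mids object.
    new_imp <- as.mids(full.impdata)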
    

    Output

    str(new_imp$imp$asdas_improvement)
    
    'data.frame':   11 obs. of  5 variables:
     $ 1: num  0 1 0 0 1 0 0 0 1 0 ...
     $ 2: num  0 1 0 0 0 0 0 1 0 0 ...
     $ 3: num  0 1 0 0 0 0 1 1 0 0 ...
     $ 4: num  0 1 1 0 0 0 0 1 0 0 ...
     $ 5: num  0 1 0 0 0 0 0 0 1 0 ...
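
The round trip above can be checked with a self-contained sketch. It is only an illustration under stated assumptions: it uses the nhanes dataset that ships with mice as a stand-in for the question's data, and the indicator high_chl with its cutoff of 200 is a hypothetical example, not part of the original problem.

```r
library(mice)
library(dplyr)

# Stand-in data: nhanes ships with mice and has missing values in chl.
imp <- mice(nhanes, m = 5, maxit = 5, printFlag = FALSE, seed = 449)

# Stack the original data (.imp == 0) and the completed datasets, then
# derive a hypothetical indicator from an imputed column.
long <- complete(imp, "long", include = TRUE) %>%
  mutate(high_chl = if_else(chl >= 200, 1, 0))

imp2 <- as.mids(long)

# The indicator is NA in the original data wherever chl is NA, so the
# rebuilt object stores one value per such row and per imputation.
dim(imp2$imp$high_chl)

# In every completed dataset the indicator should be fully observed and
# agree with the imputed chl values.
for (i in seq_len(imp2$m)) {
  d <- complete(imp2, i)
  stopifnot(!anyNA(d$high_chl),
            all(d$high_chl == as.numeric(d$chl >= 200)))
}

# The rebuilt object can be analysed like any other mids object.
fit <- with(imp2, lm(bmi ~ high_chl + age))
pooled <- pool(fit)
```

The same checks should carry over to asdas_improvement by swapping in the question's data and threshold.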