The following dataset is available:
data <- structure(list(asdas_6month = c(23.1222666868239, 25.4056847196073,
25.9886630231065, NA, 26.9450864282904, 15.1832953552198, 22.1618055512694,
NA, 24.1387146612986, 25.598233740795, 22.6844495409994, 25.0138310842063,
20.9944595011522, 17.0762423377328, NA, NA, 20.2359010676347,
17.5468970969989, 22.9765676870538, 26.3032333127368, NA, NA,
NA, 17.3203951667699, 19.126959104744), gender = structure(c(1L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L), .Label = c("Female", "Male"), class = "factor"),
age = c(47.9379517873091, 46.837373193357, 48.5646295793097,
43.1378807456583, 60.3619393447192, 70.1290549397305, 84.3587981654008,
59.2292347942614, 41.7327157246053, 52.0137845399698, 55.0951441078166,
71.6184307122057, 43.3101374804154, 33.5854501557607, 51.9032470737109,
68.1204996602706, 42.9427562299075, 55.909031412815, 29.895500127283,
20.9296411673894, 29.3957377286062, 46.974102661638, 54.6740110130539,
42.6997039072135, 67.3413773507263), asdas_baseline = c(63.7251494911822,
NA, 65.0638161875852, 70.1816100941605, 53.1972327260365,
62.980030777934, 60.3085321252511, 58.9998256902073, 56.8045598820947,
54.4446059090559, NA, 61.7293600038226, 56.5674724119214,
62.8593507709476, NA, 54.9028311743253, NA, NA, 67.6467591815449,
58.5134614505046, 59.3735346553234, 51.9158516755166, 63.0645651881476,
58.7759004270177, 55.0687922895208)), class = "data.frame", row.names = c(NA,
-25L))
Here is what it looks like:
'data.frame': 25 obs. of 4 variables:
$ asdas_6month : num 23.1 25.4 26 NA 26.9 ...
$ gender : Factor w/ 2 levels "Female","Male": 1 2 1 1 1 1 2 2 2 1 ...
$ age : num 47.9 46.8 48.6 43.1 60.4 ...
$ asdas_baseline: num 63.7 NA 65.1 70.2 53.2 ...
Using the following code, I can generate a mids object from the mice library and create five imputed datasets:
library(mice)
new_imp <- mice(data, m=5, maxit=10, print = FALSE, seed = 449)
print(new_imp)
Number of multiple imputations:  5
Imputation methods:
  asdas_6month         gender            age asdas_baseline
         "pmm"             ""             ""          "pmm"
PredictorMatrix:
               asdas_6month gender age asdas_baseline
asdas_6month              0      1   1              1
gender                    1      0   1              1
age                       1      1   0              1
asdas_baseline            1      1   1              0
My aim is to create a new variable, asdas_improvement, which flags an improvement of 40 or more in ASDAS score at 6 months. Normally I can calculate this with the mutate function of dplyr as follows:
library(dplyr)
data %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))
How does one recode a similar variable inside the mids object?
To do a calculation on the imputed datasets, we can use complete to get a data frame of the imputed data in long format (with include = TRUE, so that the original, unimputed data are kept as well). Then we can use mutate as normal to make the calculation. Finally, we can use as.mids to turn the result back into a mids object.
library(dplyr)
full.impdata <- complete(new_imp, 'long', include = TRUE) %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))
new_imp <- as.mids(full.impdata)
Output
str(new_imp$imp$asdas_improvement)
'data.frame': 11 obs. of 5 variables:
$ 1: num 0 1 0 0 1 0 0 0 1 0 ...
$ 2: num 0 1 0 0 0 0 0 1 0 0 ...
$ 3: num 0 1 0 0 0 0 1 1 0 0 ...
$ 4: num 0 1 1 0 0 0 0 1 0 0 ...
$ 5: num 0 1 0 0 0 0 0 0 1 0 ...
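The 11 rows in that output are the observations where asdas_6month or asdas_baseline was missing in the original data, so the derived variable had to be computed from imputed values. As a rough, self-contained sketch of the same round trip followed by a pooled analysis, the block below uses the nhanes data that ships with mice as a stand-in for the data above. The variable high_chl, its cutoff of 200, and the regression model are hypothetical and only there for illustration; with the real data you would use new_imp and asdas_improvement instead.

```r
library(mice)
library(dplyr)

# Assumption: the built-in `nhanes` data stands in for the question's data.
imp <- mice(nhanes, m = 5, maxit = 10, printFlag = FALSE, seed = 449)

# Long format including the original data (.imp == 0), which as.mids() needs
long <- mice::complete(imp, action = "long", include = TRUE) %>%
  mutate(high_chl = if_else(chl >= 200, 1, 0))  # hypothetical derived variable
imp2 <- as.mids(long)

# The derived column should be fully observed in every completed dataset
completed <- mice::complete(imp2, action = "long")
n_missing <- sum(is.na(completed$high_chl))
print(n_missing)

# Fit the same (illustrative) model in each imputed dataset, then pool
fit <- with(imp2, lm(bmi ~ high_chl + age))
pooled <- summary(pool(fit))
print(pooled)
```

Calling mice::complete with the namespace prefix is a defensive choice: it avoids a clash with tidyr::complete if tidyr happens to be attached.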