tab_corr, tab_df, and psych::describe on a mice mids object


I have imputed data saved as a mids object and am trying to adapt my usual workflow to it. However, I cannot figure out how to use sjPlot's tab_corr() and tab_df() or psych::describe() on a mids object.

My goal is to generate a table of descriptive statistics and a correlation matrix without averaging the imputed datasets together. I was able to generate correlations using miceadds::micombine.cor(), but the output is a long table with one row per pair of variables rather than a typical square correlation matrix. I can also compute means, SDs, etc. of individual variables from the mids object, but I'm looking for something that will generate a table.

library(mice)
library(miceadds)
library(sjPlot)
library(tidyverse)
library(psych)
set.seed(123)

## correlation matrix

data(nhanes)
imp <- mice(nhanes, print = FALSE)
head(micombine.cor(mi.res = imp)) # ugly
#>   variable1 variable2           r       rse    fisher_r fisher_rse        fmi
#> 1       age       bmi -0.38765907 0.1899398 -0.40904214  0.2234456 0.09322905
#> 2       age       hyp  0.51588273 0.1792162  0.57071301  0.2443348 0.25939786
#> 3       age       chl  0.37685482 0.2157535  0.39638877  0.2515615 0.30863126
#> 4       bmi       hyp -0.01748158 0.2244419 -0.01748336  0.2245067 0.10249784
#> 5       bmi       chl  0.29082393 0.2519295  0.29946608  0.2752862 0.44307791
#> 6       hyp       chl  0.30271060 0.1984525  0.31250096  0.2185381 0.04935528
#>             t          p     lower95   upper95
#> 1 -1.83061192 0.06715849 -0.68949235 0.0288951
#> 2  2.33578315 0.01950255  0.09156846 0.7816509
#> 3  1.57571320 0.11509191 -0.09636276 0.7111171
#> 4 -0.07787455 0.93792784 -0.42805131 0.3990695
#> 5  1.08783556 0.27666771 -0.23557593 0.6852881
#> 6  1.42996130 0.15272813 -0.11531056 0.6296450

data(iris)
iris %>% 
  select(-c(Species)) %>% 
  tab_corr() # pretty
             Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
Sepal.Length               -0.118       0.872***      0.818***
Sepal.Width  -0.118                     -0.428***     -0.366***
Petal.Length 0.872***      -0.428***                  0.963***
Petal.Width  0.818***      -0.366***    0.963***
Computed correlation used pearson-method with listwise-deletion.

## descriptive statistics

psych::describe(imp) # error
#> Warning in mean.default(x, na.rm = na.rm): argument is not numeric or logical:
#> returning NA
#> Error in is.data.frame(x): 'list' object cannot be coerced to type 'double'
mean(imp$data$age) # inefficient
#> [1] 1.76

iris %>% 
  select(-c(Species)) %>% 
  psych::describe() %>% 
  select(-(c(vars, n, median, trimmed, mad))) %>%
  tab_df() # pretty
mean   sd     min    max    range  skew    kurtosis  se
5.84   0.83   4.30   7.90   3.60   0.31    -0.61     0.07
3.06   0.44   2.00   4.40   2.40   0.31    0.14      0.04
3.76   1.77   1.00   6.90   5.90   -0.27   -1.42     0.14
1.20   0.76   0.10   2.50   2.40   -0.10   -1.36     0.06

Created on 2021-12-11 by the reprex package (v2.0.1)

Solution

  • I have created two functions, mice_df and mice_cor (link to GitHub repo here), that generate a table of descriptive statistics and a correlation matrix, respectively, from a mids object using Rubin's rules.

    For regression models, gtsummary will neatly format models fitted to mids objects.

    library(mice)
    library(gtsummary)
    library(missy)
    library(dplyr)
    
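    data(nhanes)
    imp <- mice(nhanes, m = 3, print = FALSE)

    # Fit the model in each imputed dataset; tbl_regression() pools the results
    mod <- with(imp, lm(age ~ bmi + chl))
    tbl_regression(as.mira(mod)) %>% as_kable()
    
    # Correlation matrix for the selected variables, built from the mids object
    vs <- c("bmi", "chl", "age", "hyp")
    title <- "Table 1: Correlation matrix"
    mice_cor(imp = imp,
             vs = vs,
             title = title)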
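
If you would rather not depend on an extra package, the same idea can be roughed out in base R. The sketch below is an illustration under stated assumptions, not the code behind mice_df or mice_cor. Both helper names (pool_descriptives, cor_long_to_matrix) are made up for this example. The first averages per-imputation means and SDs over a list of completed datasets, which gives the Rubin's rules point estimate only (no pooled standard errors). The second reshapes a long table with the columns variable1, variable2, and r, as printed by micombine.cor() in the question, into a square matrix. The toy `completed` list stands in for what mice::complete(imp, "all") returns, and the correlations are copied from the question's output.

```r
# Hypothetical helper (not part of mice, miceadds, or missy):
# average descriptive statistics over a list of completed datasets.
pool_descriptives <- function(completed, vars) {
  # One (variables x statistics) matrix per imputed dataset
  per_imp <- lapply(completed, function(d) {
    t(sapply(d[vars], function(x) c(mean = mean(x), sd = sd(x))))
  })
  # Element-wise mean across the imputations
  Reduce(`+`, per_imp) / length(per_imp)
}

# Hypothetical helper: reshape a long table of pairwise correlations
# (columns variable1, variable2, r) into a square correlation matrix.
cor_long_to_matrix <- function(long, vars) {
  m <- diag(1, length(vars))
  dimnames(m) <- list(vars, vars)
  v1 <- as.character(long$variable1)
  v2 <- as.character(long$variable2)
  m[cbind(v1, v2)] <- long$r  # fill one triangle by name
  m[cbind(v2, v1)] <- long$r  # mirror into the other triangle
  m
}

# Toy stand-in for mice::complete(imp, "all"): two completed datasets
completed <- list(
  data.frame(bmi = c(20, 22, 24), chl = c(180, 200, 220)),
  data.frame(bmi = c(22, 24, 26), chl = c(190, 200, 210))
)
desc <- pool_descriptives(completed, c("bmi", "chl"))
print(desc)

# Pooled correlations copied from the micombine.cor() output in the question
long <- data.frame(
  variable1 = c("age", "age", "age", "bmi", "bmi", "hyp"),
  variable2 = c("bmi", "hyp", "chl", "hyp", "chl", "chl"),
  r = c(-0.38765907, 0.51588273, 0.37685482,
        -0.01748158, 0.29082393, 0.30271060)
)
cor_mat <- cor_long_to_matrix(long, c("age", "bmi", "hyp", "chl"))
print(round(cor_mat, 2))
```

Both results are ordinary matrices, so after as.data.frame() they should be usable with tab_df() for formatting. Averaging the SDs is a simplification; pool variances instead if you need something more principled.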