Tags: r, kableExtra, gt, modelsummary

Display statistics for the whole sample as well as subgroups with modelsummary or similar packages


I am trying to create a table with descriptive statistics for the whole sample as well as subgroups. My goal is to use the wonderful modelsummary R package to return one table with mean, sd, min, median, max, and graphs for the variables calculated for the entire sample as well as mean and sd for every group. I was able to achieve this with two separate tables. However, I would like to have all this information in a single table with the statistics about the entire sample first (see Fig. 1) and the subgroups second (see Fig. 2). If possible, I would also like to add the first-level heading for the whole sample and name it "All" or "Entire sample." Lastly, given that journals in my field require the use of the APA style, I wonder if the table can be turned into this format (e.g., with all the required borders, text in black instead of grey, etc.) (see Fig. 3). If modelsummary does not handle this, I am also open to trying other packages. Thanks so much to anyone who will help!

library(palmerpenguins)
library(tidyverse)
library(kableExtra)
library(modelsummary)

penguins <- penguins %>% as.data.frame() %>% select(species, bill_length_mm, bill_depth_mm,  flipper_length_mm, body_mass_g)

#scale variables for histogram and boxplot
pen_scaled <- penguins %>% select(bill_length_mm, bill_depth_mm,  flipper_length_mm, body_mass_g) %>% 
  mutate(across(where(is.numeric), ~scale(.))) %>% as.data.frame()

# create a list with individual variables and remove missing
pen_list <- lapply(pen_scaled, na.omit)

# create a table with `datasummary`
# add a histogram with column_spec and spec_hist
# add a boxplot with column_spec and spec_boxplot
emptycol <- function(x) " "
pen_table <- datasummary(All(penguins) ~ Mean + SD + Min + Median + Max + Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol, data = penguins) %>%
    column_spec(column = 7, image = spec_boxplot(pen_list)) %>%
    column_spec(column = 8, image = spec_hist(pen_list))

pen_table

Figure 1


pen_table2 <- datasummary_balance(~species, data = penguins, dinm = FALSE)

pen_table2

Figure 2


Figure 3



Solution

  • There are two questions here:

    1. How to change the appearance of the table?
    2. How to create a table with a given shape (yet to be specifically defined)?

    Question 1

    The appearance is highly configurable with the kableExtra package (modelsummary also supports gt, huxtable, and flextable through the output argument). An easy way to change the look is to apply the kable_classic() function from kableExtra to the finished table, as in the example under Question 2. If you have more specific needs, please refer to the kableExtra documentation.
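    As a rough illustration of the styling step on its own, here is a minimal sketch. It uses the built-in mtcars data rather than the penguins data, and the object names `tab` and `tab_classic` are made up for this example. The serif font is my assumption about what an APA-style table might use; this is not a complete APA template. The explicit output = "kableExtra" is there because recent modelsummary releases (2.0.0 and later, as far as I know) no longer return a kableExtra table by default.

    ```r
    library(modelsummary)
    library(kableExtra)

    # Illustrative data: the built-in mtcars, not the penguins data from the question.
    tab <- datasummary(mpg + hp + wt ~ Mean + SD + Min + Max,
                       data = mtcars,
                       output = "kableExtra")

    # kable_classic() applies a plain serif theme with horizontal rules;
    # extra arguments such as full_width are passed on to kable_styling().
    tab_classic <- tab %>%
      kable_classic(full_width = FALSE, html_font = "Times New Roman")

    tab_classic
    ```

    Further tweaks (column widths, footnotes, extra header rows) can be chained onto the same pipe with other kableExtra functions.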

    Question 2

    As noted in the documentation for datasummary(), you can use a 1 to indicate the "full sample". Here is a minimal example:

    library(palmerpenguins)
    library(tidyverse)
    library(kableExtra)
    library(modelsummary)
    
    penguins <- penguins %>% as.data.frame() %>% select(species, bill_length_mm, bill_depth_mm,  flipper_length_mm, body_mass_g)
    
    # scale variables for histogram and boxplot
    pen_scaled <- penguins %>% select(bill_length_mm, bill_depth_mm,  flipper_length_mm, body_mass_g) %>% 
      mutate(across(where(is.numeric), ~scale(.))) %>% as.data.frame()
    pen_list <- lapply(pen_scaled, na.omit)
    
    emptycol <- function(x) " "
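    # `1` stands for the full sample and Heading() labels that block.
    # output = "kableExtra" is requested explicitly because column_spec() and
    # kable_classic() expect a kableExtra table.
    datasummary(
      All(penguins) ~
        Heading("Entire sample") * 1 *
          (Mean + SD + Min + Median + Max +
             Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol) +
        species * (Mean + SD),
      data = penguins, output = "kableExtra") %>%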
        column_spec(column = 7, image = spec_boxplot(pen_list)) %>%
        column_spec(column = 8, image = spec_hist(pen_list)) %>%
        kable_classic()
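
    Because column_spec() addresses columns by position, it can help to look at the table layout before attaching the plots. The sketch below is only an illustration: it uses mtcars with cyl converted to a factor as a stand-in for species, and the name `layout_check` is hypothetical. It asks datasummary() for a plain data frame so that the order of the columns (row labels first, then the full-sample block, then one block per group) can be inspected.

    ```r
    library(modelsummary)

    # Stand-in data: mtcars with cyl as a grouping factor (plays the role of species).
    dat <- mtcars
    dat$cyl <- factor(dat$cyl)

    # Same formula shape as above, without the image columns:
    # a full-sample block labelled "Entire sample", then Mean and SD per group.
    layout_check <- datasummary(
      mpg + hp ~ Heading("Entire sample") * 1 * (Mean + SD) + cyl * (Mean + SD),
      data = dat,
      output = "data.frame")

    # One row per variable; count the columns to find the index for column_spec().
    layout_check
    ```

    In the penguins table above, the row labels occupy column 1 and the five statistics columns 2 to 6, which is why the boxplot and histogram go in columns 7 and 8.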
    
