I am trying to create a table with descriptive statistics for the whole sample as well as subgroups. My goal is to use the wonderful modelsummary R package to return one table with mean, sd, min, median, max, and graphs for the variables calculated for the entire sample as well as mean and sd for every group. I was able to achieve this with two separate tables. However, I would like to have all this information in a single table with the statistics about the entire sample first (see Fig. 1) and the subgroups second (see Fig. 2). If possible, I would also like to add the first-level heading for the whole sample and name it "All" or "Entire sample." Lastly, given that journals in my field require the use of the APA style, I wonder if the table can be turned into this format (e.g., with all the required borders, text in black instead of grey, etc.) (see Fig. 3). If modelsummary does not handle this, I am also open to trying other packages. Thanks so much to anyone who will help!
library(palmerpenguins)
library(tidyverse)
library(kableExtra)
library(modelsummary)
penguins <- penguins %>% as.data.frame() %>% select(species, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)
# scale variables for histogram and boxplot
pen_scaled <- penguins %>%
  select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g) %>%
  mutate(across(where(is.numeric), ~scale(.))) %>%
  as.data.frame()
# create a list with individual variables and remove missing
pen_list <- lapply(pen_scaled, na.omit)
# create a table with `datasummary`
# add a histogram with column_spec and spec_hist
# add a boxplot with column_spec and spec_boxplot
emptycol <- function(x) " "
pen_table <- datasummary(
  All(penguins) ~ Mean + SD + Min + Median + Max +
    Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol,
  data = penguins) %>%
  column_spec(column = 7, image = spec_boxplot(pen_list)) %>%
  column_spec(column = 8, image = spec_hist(pen_list))
pen_table
Figure 1
pen_table2 <- datasummary_balance(~species, data = penguins, dinm = FALSE)
pen_table2
Figure 2
Figure 3
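There are two questions here: how to change the appearance of the table, and how to show the full sample alongside the subgroups.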
The appearance is extremely configurable using the kableExtra package (modelsummary also supports gt, huxtable, and flextable through the output argument). An easy way to change the look is to use the kable_classic() function from kableExtra, as illustrated below. If you have more specific needs, please refer to the kableExtra documentation.
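As a rough sketch of the styling step on its own, the snippet below builds a small table and applies kable_classic(). The font and width settings are illustrative assumptions, not APA requirements, and output = "kableExtra" is passed explicitly because newer modelsummary releases may use a different default table backend.

```r
library(palmerpenguins)
library(kableExtra)
library(modelsummary)

# Small table: mean and SD of two numeric variables
tab <- datasummary(
  bill_length_mm + body_mass_g ~ Mean + SD,
  data = as.data.frame(penguins),
  output = "kableExtra"  # request a kableExtra object explicitly
)

# Classic theme; font and width are illustrative choices only
tab_classic <- kable_classic(
  tab,
  full_width = FALSE,
  html_font = "Times New Roman"
)

tab_classic
```

Swapping output = "kableExtra" for "gt", "huxtable", or "flextable" should return an object for that package instead, which would then be styled with that package's own functions rather than kable_classic().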
As noted in the documentation for datasummary(), you can use a 1 to indicate the "full sample". Here is a minimal example:
library(palmerpenguins)
library(tidyverse)
library(kableExtra)
library(modelsummary)
penguins <- penguins %>% as.data.frame() %>% select(species, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)
# scale variables for histogram and boxplot
pen_scaled <- penguins %>%
  select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g) %>%
  mutate(across(where(is.numeric), ~scale(.))) %>%
  as.data.frame()
pen_list <- lapply(pen_scaled, na.omit)
emptycol <- function(x) " "
datasummary(
  All(penguins) ~ Heading("Entire sample") * 1 *
    (Mean + SD + Min + Median + Max +
       Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol) +
    species * (Mean + SD),
  data = penguins) %>%
  column_spec(column = 7, image = spec_boxplot(pen_list)) %>%
  column_spec(column = 8, image = spec_hist(pen_list)) %>%
  kable_classic()
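The column indices passed to column_spec() depend on how the formula lays out the table, so it can help to inspect the layout first. The sketch below is a cut-down version of the formula above (two variables, no plot columns; the name layout_df is just for illustration) that asks for a plain data frame via output = "data.frame" so the column order can be checked.

```r
library(palmerpenguins)
library(modelsummary)

# Same structure as the combined table, minus the plot columns,
# restricted to two variables to keep the output small
layout_df <- datasummary(
  bill_length_mm + body_mass_g ~
    Heading("Entire sample") * 1 * (Mean + SD) +
    species * (Mean + SD),
  data = as.data.frame(penguins),
  output = "data.frame"
)

# The first column holds the variable names; statistics follow it
print(colnames(layout_df))
print(layout_df)
```

Because the "Entire sample" block comes first in the formula, its statistics sit right after the variable-name column, which is why the boxplot and histogram land in columns 7 and 8 of the full table.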