I'm trying to generate a neat table using the kableExtra / gt packages (or anything that works), with a goal of having my value columns grouped by the different Variables:
data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
Variables = c(letters[1:3],letters[1:3]),
Count = c(45, 76, 43, 23, 11, 46),
Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% arrange(Name, Variables)
# Desired output:
# a b c
# Count Percent | Count Percent | Count Percent
# Mary 45 45% 76 56% 43 89%
# Jane 23 65% 11 88% 46 91%
I can't seem to figure out how to go this, and the closest I can come is:
library(gt)
gt(data, rowname_col = "Variables", groupname_col = "Name")
I'm taking inspiration that this can be done from the graph shown in this page: table with output columns grouped by year variable
Thanks for your help!
UPDATE: Solved based on post from Stefan:
data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
Variables = c(letters[1:3],letters[1:3]),
Count = c(45, 76, 43, 23, 11, 46),
Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>%
arrange(desc(Name), Variables)
# Helper to put the columns in the right order
cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))
data_wide <- data %>%
pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>%
# Reorder columns
select(all_of(c("Name", cols_order)))
data_wide %>%
gt(rowname_col = "Name") %>%
tab_spanner_delim(delim = "_") %>%
fmt_percent(ends_with("Percent"), decimals = 0)
To achieve your desired result you could first reshape your data to wide format using e.g. tidy::pivot_wider
. Next step is to put the columns in the right order. To this end I reorder the cols of the df but this could probably also be done via gt
. The remainder is styling the table. To group by Variables
you could make use of tab_spanner_delim
and get nicely formatted percentages via fmt_percent
:
EDIT Thanks to @Shoesoff for pointing out that my original solution could be simplified considerably by making use of tab_spanner_delim
instead of tab_spanner
.
Improved answer
library(gt)
library(tidyr)
library(dplyr)
data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
Variables = c(letters[1:3],letters[1:3]),
Count = c(45, 76, 43, 23, 11, 46),
Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>%
arrange(desc(Name), Variables)
# Helper to put the columns in the right order
cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))
data_wide <- data %>%
pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>%
# Reorder columns
select(all_of(c("Name", cols_order)))
data_wide %>%
gt(data, rowname_col = "Name") %>%
tab_spanner_delim(
delim = "_"
) %>%
fmt_percent(ends_with("Percent"), decimals = 0)