Search code examples
rkablekableextragt

Trying to make a table in R where I group columns by variables of a vector


I'm trying to generate a neat table using the kableExtra / gt packages (or anything that works), with a goal of having my value columns grouped by the different Variables:

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% arrange(Name, Variables)

# Desired output:
#             a              b               c
#       Count Percent | Count Percent | Count Percent
# Mary   45    45%       76    56%       43    89%
# Jane   23    65%       11    88%       46    91%    

I can't seem to figure out how to go this, and the closest I can come is:

library(gt)
gt(data, rowname_col = "Variables", groupname_col = "Name")

output

I'm taking inspiration that this can be done from the graph shown in this page: table with output columns grouped by year variable

Thanks for your help!

UPDATE: Solved based on post from Stefan:

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% 
        arrange(desc(Name), Variables)

# Helper to put the columns in the right order
cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))

data_wide <- data %>% 
        pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>% 
        # Reorder columns
        select(all_of(c("Name", cols_order)))

data_wide %>% 
        gt(rowname_col = "Name") %>%
        tab_spanner_delim(delim = "_") %>%
        fmt_percent(ends_with("Percent"), decimals = 0)

Solution

  • To achieve your desired result you could first reshape your data to wide format using e.g. tidy::pivot_wider. Next step is to put the columns in the right order. To this end I reorder the cols of the df but this could probably also be done via gt. The remainder is styling the table. To group by Variables you could make use of tab_spanner_delim and get nicely formatted percentages via fmt_percent:

    EDIT Thanks to @Shoesoff for pointing out that my original solution could be simplified considerably by making use of tab_spanner_delim instead of tab_spanner.

    Improved answer

    library(gt)
    library(tidyr)
    library(dplyr)
    
    data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                       Variables = c(letters[1:3],letters[1:3]),
                       Count = c(45, 76, 43, 23, 11, 46),
                       Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% 
      arrange(desc(Name), Variables)
    
    # Helper to put the columns in the right order
    cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))
    
    data_wide <- data %>% 
      pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>% 
      # Reorder columns
      select(all_of(c("Name", cols_order)))
    
    data_wide %>% 
      gt(data, rowname_col = "Name") %>% 
      tab_spanner_delim(
        delim = "_"
      ) %>% 
      fmt_percent(ends_with("Percent"), decimals = 0)
    

    enter image description here