Search code examples
rtidyversegt

How do I merge columns using gt on summary rows?


Background and Question

I'm using the gt package in R to create presentation quality tables. For the table I am creating, I am trying to show both the counts and the percentage of the total that the counts represent in a condensed format. How do I apply the results seen on the individual country lanes to the subtotal and grand total rows?

Data

library(tidyverse)
library(gt)

df <- tibble(country_lane = c("CA-US", "MX-CA", "MX-US", "US-CA", 
                                "US-MX", "US-US"), 
                     `Above Market` = c(2L, 3L, 33L, 3L, 3L, 54L
                                ), 
                     `Below Market` = c(18L, 0L, 2L, 14L, 3L, 370L), 
                     `In Market` = c(15L, 0L, 25L, 15L, 3L, 240L),
                 BU = c("US", "CAN", "US", "CAN", "US", "US"))
> df  
# A tibble: 6 × 5
  country_lane `Above Market` `Below Market` `In Market` BU   
  <chr>                 <int>          <int>       <int> <chr>
1 CA-US                     2             18          15 US   
2 MX-CA                     3              0           0 CAN  
3 MX-US                    33              2          25 US   
4 US-CA                     3             14          15 CAN  
5 US-MX                     3              3           3 US   
6 US-US                    54            370         240 US   

Working the Problem....

This part works:

I'm open to suggestions how to do this first part better, but I'm generally happy this works as is to create the percentage columns and groupings for subtotal and total.

mkt_cols <- c("Below Market", "In Market", "Above Market")  # Used for across() below

denom <- sum(df$`Below Market` + df$`In Market` + df$`Above Market`)  # Was hoping to find an easier way to employ mkt_cols here in case in the future I have more columns, but this is okay to get the denominator.

gt_out1 <- df %>% 
mutate(across(all_of(mkt_cols), ~ .x / denom , 
              .names = "{.col}_pct" )) %>%
gt(rowname_col = c("country_lane", "Subtotal"),  groupname_col = "BU") %>%
summary_rows(columns = c("Below Market", "In Market", "Above Market"),
             fns = list("Subtotal" = ~sum(.))) %>%
summary_rows(columns = ends_with("_pct"),
             fns = list("Subtotal" = ~sum(.)),
             fmt = ~ fmt_percent(., decimals = 1)) %>% 
grand_summary_rows(columns = c("Below Market", "In Market", "Above Market"),
                   fns = list("Total" = ~sum(.))) %>%
grand_summary_rows(columns = ends_with("_pct"),
                   fns = list("Total" = ~sum(.)),
                   fmt = ~ fmt_percent(., decimals = 1)) %>% 
fmt_percent(columns = ends_with("_pct"),
            decimals = 1)

Table that is incomplete, but directionally correct so far.

But here's my problem:

I wanted to do a cols_merge() to show the percentages inside parentheses directly next to the counts and condense the table. This works great for the individual rows, but I cannot get it to apply to the subtotal or total rows. Here's the code I have and the output I don't like.

gt_out2 <- gt_out1 %>% 
cols_merge(columns = c("Below Market", "Below Market_pct"),
           pattern = "{1} ({2})") %>% 
cols_merge(columns = c("In Market", "In Market_pct"),
           pattern = "{1} ({2})") %>% 
cols_merge(columns = c("Above Market", "Above Market_pct"),
           pattern = "{1} ({2})")

Percentages missing on the subtotal and total rows.

Closing

I did look at gtsummary which looks like it would do something similar to what I'm asking here, but I am just learning gt() and it seems like I might have better control over the formatting here, plus I couldn't quite figure out the aggregation I needed for the percentages and layout with gtsummary. Any help or suggestions are appreciated. Thank you.


Solution

  • A data.frame is generated same way you pointed in your question. Note that the column names contains underscores instead spaces:

    library(tidyverse)
    library(gt)
    
    df <- tibble(country_lane = c("CA-US", "MX-CA", "MX-US", "US-CA", 
                                  "US-MX", "US-US"), 
                 Above_Market = c(2L, 3L, 33L, 3L, 3L, 54L
                 ), 
                 Below_Market = c(18L, 0L, 2L, 14L, 3L, 370L), 
                 In_Market = c(15L, 0L, 25L, 15L, 3L, 240L),
                 BU = c("US", "CAN", "US", "CAN", "US", "US"))
    

    Columns for the percentages from the total is calculated similar way you did it.

    total <- sum(df %>% select(Above_Market,Below_Market,In_Market))
    df.pct <- df %>% mutate(across(c("Above_Market","Below_Market","In_Market"), ~ .x/total, .names = "{.col}_pct"))
    

    After a pivot_longer transformation of the data.frame, Subtotal and Total rows are calculated.

    df.long <- df.pct %>% pivot_longer(-c(country_lane,BU),names_to = "Market", values_to = "values")
    df.subtotal <- rbind(df.long, 
                         df.long %>% group_by(Market,BU) %>% summarise(values=sum(values)) %>% mutate(country_lane="Subtotal"),
                         df.long %>% group_by(Market) %>% summarise(values=sum(values)) %>% mutate(country_lane="Total",BU=NA_character_)) %>%
                         pivot_wider(names_from = "Market", values_from = "values") 
    

    Here I show how would be the last rows of df.subtotal data.frame

    tail(df.subtotal)
    # A tibble: 6 × 8
      country_lane BU    Above_Market Below_Market In_Market Above_Market_pct Below_Market_pct In_Market_pct
      <chr>        <chr>        <dbl>        <dbl>     <dbl>            <dbl>            <dbl>         <dbl>
    1 US-CA        CAN              3           14        15          0.00374          0.0174        0.0187 
    2 US-MX        US               3            3         3          0.00374          0.00374       0.00374
    3 US-US        US              54          370       240          0.0672           0.461         0.299  
    4 Subtotal     CAN              6           14        15          0.00747          0.0174        0.0187 
    5 Subtotal     US              92          393       283          0.115            0.489         0.352  
    6 Total        NA              98          407       298          0.122            0.507         0.371  
    

    gt object is built removing Total row and considering BU column as groupname_col:

    gt_out1 <- df.subtotal %>% filter(country_lane != "Total") %>% 
      gt(rowname_col = c("country_lane"),  groupname_col = "BU") %>%
      rows_add(.list=as.list(df.subtotal %>% filter(country_lane=="Total"))) %>%
      fmt_percent(columns = ends_with("_pct"),
                  decimals = 1)
    

    Finally, cols_merge transformation is done

    gt_out2 <- gt_out1 %>% 
      cols_merge(columns = c("Below_Market", "Below_Market_pct"),
                 pattern = "{1} ({2})") %>% 
      cols_merge(columns = c("In_Market", "In_Market_pct"),
                 pattern = "{1} ({2})") %>% 
      cols_merge(columns = c("Above_Market", "Above_Market_pct"),
                 pattern = "{1} ({2})")