
gt: Is it possible to programmatically exclude rows from summary functions?


I am using the gt package in R to create a table which has what I call "selective subgroup breakdowns". Here is a toy dataset which demonstrates the issue:

library(gt)
library(tibble)

df = tribble(
  ~group, ~name, ~is_breakdown, ~num, ~percent,
  "A",    "percent_happy"              , FALSE, 50, .5,
  "A",    "percent_of_happy_very_happy", TRUE , 5 , .1,
  "A",    "percent_sad"                , FALSE, 50, .5,
  "B",    "percent_happy"              , FALSE, 50, .5,
  "B",    "percent_sad"                , FALSE, 50, .5,
  "B",    "percent_of_sad_very_sad"    , TRUE , 10, .2
)

gt(df, groupname_col = "group", rowname_col = "name") |>
  tab_stub_indent(rows = df$is_breakdown, indent = 4) |>
  summary_rows(columns = c("num", "percent"), 
               fns = list(label = "Sub-Total", id = "sub-total") ~ sum(.)) |>
  grand_summary_rows(columns = c("num", "percent"), 
                     fns = list(label = "Grand Total", id = "grand-total") ~ sum(.)) 

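This generates a table in which the sub-totals and the grand total also count the breakdown rows (for example, the num sub-total for group A is 105 rather than 100).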

The dataset has 100 people in each group. But each group has a (different) separate row which is intended to be read "By the way, a certain fraction of this group has this other property you should know about."

In my real situation, I am trying to handle breakouts like this programmatically. The actual data is simply not known ahead of time. Like in the example, I use a separate is_breakdown column to do a selective indent. I would like to somehow use that column to have the summary and grand_summary rows skip the breakdown row. In this example, each summary row should have 100 people and the grand total should have 200 people. Is that possible?

If that is not possible, is it possible to manually create the summary and grand_summary rows myself, and somehow manually add them?


Solution

  • Pass your is_breakdown column to the aggregation function and use it to subset the values before summing:

    # Sum only the elements of x for which keep is TRUE
    g <- function(x, keep) {
      sum(x[keep])
    }
    
    gt(df, groupname_col = "group", rowname_col = "name") |>
      tab_stub_indent(rows = df$is_breakdown, indent = 4) |>
      # !is_breakdown drops the breakdown rows from each group's sub-total
      summary_rows(columns = c("num", "percent"),
                   fns = list(label = "Sub-Total", id = "sub-total") ~ g(., !is_breakdown)) |>
      # The same filter is applied across all rows for the grand total
      grand_summary_rows(columns = c("num", "percent"),
                         fns = list(label = "Grand Total", id = "grand-total") ~ g(., !is_breakdown))
    

    The resulting table shows the appropriate sub-totals and grand total.
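
    To sanity-check the totals independently of gt, the same filtering can be reproduced in base R. This is only a minimal sketch: it rebuilds the toy data from the question with data.frame() instead of tribble(), and the names sub_totals and grand_total are illustrative, not part of gt.

```r
# Toy data from the question, rebuilt with base R only (no tibble needed).
df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  name = c("percent_happy", "percent_of_happy_very_happy", "percent_sad",
           "percent_happy", "percent_sad", "percent_of_sad_very_sad"),
  is_breakdown = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE),
  num = c(50, 5, 50, 50, 50, 10),
  percent = c(0.5, 0.1, 0.5, 0.5, 0.5, 0.2)
)

# Sum only the elements of x for which keep is TRUE.
g <- function(x, keep) {
  sum(x[keep])
}

# Per-group sub-totals of num, skipping the breakdown rows.
# sub_totals is a hypothetical name used for this check only.
sub_totals <- sapply(
  split(df, df$group),
  function(d) g(d$num, !d$is_breakdown)
)

# Grand total of num over all non-breakdown rows.
grand_total <- g(df$num, !df$is_breakdown)

print(sub_totals)   # 100 for group A and 100 for group B
print(grand_total)  # 200
```

    These are the values the question asks for: 100 people per group and 200 overall, so the summary rows in the gt table can be compared against them.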