Add Group Subheader and Subtotal Rows to data.frame or table in R

Objective

I wish to add subheader and subtotal/margin rows within a table. Ultimately, I am looking for a structure shown below, which I will export to Excel with openxlsx and writeData.

	2019	2020	2021
A
A1	1001	1157	911
A2	1005	803	1110
A3	1125	897	1190
Total A	3131	2857	3211
B
B1	806	982	1098
B2	1106	945	1080
B3	1057	1123	867
Total B	2969	3050	3045
C
C1	847	1087	1140
C2	1146	966	1176
C3	1071	915	892
Total C	3064	2968	3208
Total All	9164	8875	9464

I suspect the subheaders and subtotals are completely different questions, but I am asking both here in case there is a common method related to each.

Reproducible Code So Far

Create the Sample Data (long format):

d <- data.frame(
  year = rep(c(2019, 2020, 2021), times = 9),
  sector = rep(c("A","B","C"),each = 9),
  subsector = paste0(rep(c("A","B","C"),each = 9), rep(c("1","2","3"), each = 3)),
  value = sample(800:1200, 27, replace = FALSE)
)

Output:

head(d)
#>   year sector subsector value
#> 1 2019      A        A1  1001
#> 2 2020      A        A1  1157
#> 3 2021      A        A1   911
#> 4 2019      A        A2  1005
#> 5 2020      A        A2   803
#> 6 2021      A        A2  1110

Format wide and add a margin (total) row:

library(janitor)
#[snip]warnings[/snip]
library(tidyverse)
#[snip]warnings[/snip]

d %>%
    group_by(year, sector, subsector) %>%
    summarise(sales = sum(value, na.rm = TRUE)) %>% 
    pivot_wider(names_from = year, values_from = sales) %>%
    janitor::adorn_totals(where = "row")

Output:

#> `summarise()` has grouped output by 'year', 'sector'. You can override using the `.groups` argument.
#>  sector subsector 2019 2020 2021
#>       A        A1 1001 1157  911
#>       A        A2 1005  803 1110
#>       A        A3 1125  897 1190
#>       B        B1  806  982 1098
#>       B        B2 1106  945 1080
#>       B        B3 1057 1123  867
#>       C        C1  847 1087 1140
#>       C        C2 1146  966 1176
#>       C        C3 1071  915  892
#>   Total         - 9164 8875 9464

^{Created on 2022-03-02 by the reprex package (v2.0.1)}

The janitor package's adorn_totals() function works well for adding a margin row or column for the entire set. And Sam Firke's response here hints at a solution using tidyr::gather but my data is in a different format. I don't want to "gather" the columns. Others in the same thread show solutions but they place all the totals at the end of the table.

I can imagine a solution where I loop through the sector factors and assemble and combine tables for each sector, but I suspect I am overthinking this and there is a simpler solution.

Is there an existing solution for this objective, or ideas on accomplishing this efficiently/universally?

Please Note: the number of subsectors per sector will vary in the actual data (i.e., some may have only one subsector, others may have several), and there is no naming convention relating the subsector to the sector (i.e., the parent sector will not be part of the child subsectors name: rather than Sector: "A", Subsector: "A1", it might be Sector: "Manufacturing", Subsector: "Cars").

@akrun -- Solution!

Your answer got me 90% of the way there and your subsequent comments lead me to the remaining solution.

gt has a function as_raw_html() which, using xml2::read_html() and rvest::html_table() convert the gt() object to a tibble while keeping the subheaders.

library(dplyr)
library(tidyr)
library(purrr)
library(gt)
library(xml2)
library(rvest)

d <- data.frame(
  year = rep(c(2019, 2020, 2021), times = 9),
  sector = rep(c("A","B","C"),each = 9),
  subsector = paste0(rep(c("A","B","C"),each = 9), rep(c("1","2","3"), each = 3)),
  value = sample(800:1200, 27, replace = FALSE)
)

d %>%
  group_by(year, sector, subsector) %>%
  summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>% 
  pivot_wider(names_from = year, values_from = sales) %>%
  group_by(sector) %>%
  group_modify(~ .x %>% adorn_totals(where = "row")) %>%
  gt() %>% 
  gt::as_raw_html() %>% 
  xml2::read_html() %>% 
  rvest::html_table()
#> [[1]]
#> # A tibble: 15 x 4
#>    subsector `2019` `2020` `2021`
#>    <chr>     <chr>  <chr>  <chr> 
#>  1 A         A      A      A     
#>  2 A1        932    1117   800   
#>  3 A2        925    1078   1090  
#>  4 A3        816    1058   1146  
#>  5 Total     2673   3253   3036  
#>  6 B         B      B      B     
#>  7 B1        862    1181   947   
#>  8 B2        1083   812    912   
#>  9 B3        1079   1130   1097  
#> 10 Total     3024   3123   2956  
#> 11 C         C      C      C     
#> 12 C1        966    895    944   
#> 13 C2        970    1147   1166  
#> 14 C3        1043   1116   826   
#> 15 Total     2979   3158   2936

^{Created on 2022-03-02 by the reprex package (v2.0.1)}

The subheader rows repeat the sector name in all columns; other than that, it looks good.

Interestingly, rvest also has a read_html function that might even reference the xml2::read_html() function, but it did not work in this context.

Solution

Instead of applying adorn_totals on the entire summary, use group_modify and then convert to gt

library(dplyr)
library(tidyr)
library(purrr)
library(janitor)
library(gt)
d %>%
  group_by(year, sector, subsector) %>%
  summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>% 
  pivot_wider(names_from = year, values_from = sales) %>%
  group_by(sector) %>%
  group_modify(~ .x %>% adorn_totals(where = "row")) %>%
  
  gt()

-output

An option is also to split the column with expss

library(expss)
library(openxlsx)
out <- d %>%
  group_by(year, sector, subsector) %>%
  summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>% 
  pivot_wider(names_from = year, values_from = sales) %>%
  group_by(sector) %>%
  group_modify(~ .x %>% adorn_totals(where = "row")) %>%
  ungroup %>%
  split_columns(columns = 1) 
wb <- createWorkbook()
sh <- addWorksheet(wb, "Tables")
xl_write(out, wb, sh)
saveWorkbook(wb, file.path(getwd(), "Documents/table1.xlsx"), overwrite = TRUE)

-output