Add margin row totals in dplyr chain


I would like to add overall summary rows while also calculating summaries by group using dplyr. I have found several earlier questions asking how to do this, but none with a clear solution. One possible approach is to call count() twice and bind the rows:

mtcars %>% 
  count(cyl, gear) %>% 
  bind_rows(
    count(mtcars, gear)
  )

which nearly produces what I need (the left-most column has NAs rather than 'Total' or similar):

     cyl  gear     n
   <dbl> <dbl> <int>
1      4     3     1
2      4     4     8
3      4     5     2
4      6     3     2
5      6     4     4
6      6     5     1
7      8     3    12
8      8     5     2
9     NA     3    15
10    NA     4    12
11    NA     5     5
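
The NA values appear because the second count() has no cyl column, so bind_rows() fills it in as missing. A minimal dplyr-only sketch that labels those rows instead, assuming it is acceptable for cyl to become a character column (the names by_group, totals, and result are illustrative only):

```r
library(dplyr)

# Per-group counts; cyl becomes character so it can hold the "Total" label
by_group <- mtcars %>%
  count(cyl, gear) %>%
  mutate(cyl = as.character(cyl))

# Overall counts per gear, labelled "Total" rather than left as NA
totals <- mtcars %>%
  count(gear) %>%
  mutate(cyl = "Total")

# bind_rows() matches columns by name, so the column order of totals does not matter
result <- bind_rows(by_group, totals)
result
```

This keeps the row layout shown above, with "Total" in place of NA in the last three rows.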

Am I missing an easier/built-in solution?


Solution

  • With adorn_totals() from the janitor package:

    library(janitor)
    mtcars %>%
      tabyl(cyl, gear) %>%
      adorn_totals("row") 
    
       cyl  3  4 5
         4  1  8 2
         6  2  4 1
         8 12  0 2
     Total 15 12 5
    

    To get from there to the "long" form in your post, add tidyr::gather() to the pipeline:

    mtcars %>%
      tabyl(cyl, gear) %>%
      adorn_totals("row") %>%
      tidyr::gather(gear, n, 2:ncol(.), convert = TRUE)
    
         cyl gear  n
    1      4    3  1
    2      6    3  2
    3      8    3 12
    4  Total    3 15
    5      4    4  8
    6      6    4  4
    7      8    4  0
    8  Total    4 12
    9      4    5  2
    10     6    5  1
    11     8    5  2
    12 Total    5  5
    

    Self-promotion alert: I authored this package. I'm adding this answer because it's a genuinely efficient solution here.
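
tidyr::gather() still works, but current tidyr documents it as superseded by pivot_longer(). A hedged sketch of the same long-form reshaping with pivot_longer(), assuming tidyr 1.1.0 or later for the names_transform argument; untabyl() is used here only to hand pivot_longer() a plain data frame, and the name long is illustrative:

```r
library(janitor)
library(tidyr)

long <- mtcars %>%
  tabyl(cyl, gear) %>%
  adorn_totals("row") %>%
  untabyl() %>%  # drop the tabyl class and attributes before reshaping
  pivot_longer(
    -cyl,                                       # reshape every column except cyl
    names_to = "gear",
    values_to = "n",
    names_transform = list(gear = as.integer)   # gear column names back to integers
  )
long
```

Unlike the gather() output above, pivot_longer() emits rows one cyl value at a time, so the three "Total" rows come last, as in the layout from the question. Zero-count combinations such as cyl 8 with gear 4 are still included.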