
How to Output a List of Summaries From Different Grouping Variables When Using dplyr::group_by and dplyr::summarise


library(tidyverse)

Using a simple example from the mtcars dataset, I can group by cyl and get basic counts with this...

mtcars %>% group_by(cyl) %>% summarise(Count = n())

And I can group by both cyl and am...

mtcars %>% group_by(cyl, am) %>% summarise(Count = n())

I can then create a function that will allow me to input multiple grouping variables.

Fun <- function(dat, ...) {
  dat %>%
    group_by_at(vars(...)) %>%
    summarise(Count = n())
}
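
As a quick check, calling the function with one or two bare column names should reproduce the two summaries above. This is only a minimal sketch: it repeats the definition so it runs on its own, and it assumes a dplyr version in which `group_by_at()` is still available (it has since been superseded). The names `by_cyl` and `by_cyl_am` are just illustrative.

```r
library(dplyr)

# Same helper as above, repeated so this snippet is self-contained
Fun <- function(dat, ...) {
  dat %>%
    group_by_at(vars(...)) %>%
    summarise(Count = n())
}

# Bare column names are passed through ... to vars()
by_cyl    <- Fun(mtcars, cyl)      # one row per value of cyl
by_cyl_am <- Fun(mtcars, cyl, am)  # one row per cyl/am combination
```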

However, rather than entering multiple grouping variables, I would like to output a list of two summaries, one for counts with cyl as the grouping variable, and one for cyl and am as the grouping variables.

I feel like something similar to the following should work, but I can't seem to figure it out. I'm hoping for an rlang or purrr solution. Help would be appreciated.

Groups <- list("cyl", c("cyl", "am"))

mtcars %>% group_by(!!Groups) %>% summarise(Count = n())

Solution

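  • Here's a working, tidyeval-compliant method: loop over the grouping sets with map(), convert each character vector to symbols with syms(), and splice those symbols into group_by() with !!!.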

    library(tidyverse)
    library(rlang)
    
    Groups <- list("cyl", c("cyl", "am"))
    
    Groups %>%
      map(function(group) {
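        # Convert the character vector into a list of symbols
        # (named group_syms so it does not shadow rlang::syms)
        group_syms <- syms(group)
        mtcars %>%
          # Splice the symbols in as separate grouping variables
          group_by(!!!group_syms) %>%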
          summarise(Count = n())
      })
    
    #> [[1]]
    #> # A tibble: 3 x 2
    #>     cyl Count
    #>   <dbl> <int>
    #> 1     4    11
    #> 2     6     7
    #> 3     8    14
    #> 
    #> [[2]]
    #> # A tibble: 6 x 3
    #> # Groups:   cyl [?]
    #>     cyl    am Count
    #>   <dbl> <dbl> <int>
    #> 1     4     0     3
    #> 2     4     1     8
    #> 3     6     0     4
    #> 4     6     1     3
    #> 5     8     0    12
    #> 6     8     1     2
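
An alternative that avoids explicit symbol conversion might look like the sketch below. It is not part of the answer above, and it assumes dplyr 1.0.0 or later, where `across()` and the `.groups` argument of `summarise()` are available. The helper name `count_by` and the `"cyl_am"`-style list names are hypothetical choices made for illustration.

```r
library(dplyr)
library(purrr)

Groups <- list("cyl", c("cyl", "am"))

# Hypothetical helper: counts rows of `dat` for each combination of the
# columns named in the character vector `group`
count_by <- function(dat, group) {
  dat %>%
    group_by(across(all_of(group))) %>%
    summarise(Count = n(), .groups = "drop")  # return an ungrouped tibble
}

# Name each list element after its grouping variables, e.g. "cyl_am"
summaries <- Groups %>%
  set_names(map_chr(Groups, paste, collapse = "_")) %>%
  map(~ count_by(mtcars, .x))
```

Naming the elements lets each summary be pulled out with `summaries$cyl` or `summaries$cyl_am` rather than by position.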