Tags: r, group-by, dplyr, split-apply-combine

Programmatically calling group_by() on a varying variable


Using dplyr, I'd like to summarise by a grouping variable that I can vary (e.g. in a loop or apply-style command).

Typing the names in directly works fine:

library(dplyr)
ChickWeight %>% group_by( Chick, Diet ) %>% summarise( mw = mean( weight ) )

But group_by() wasn't written to take a character vector, so passing in column names stored in a variable is harder:

v <- "Diet"
ChickWeight %>% group_by( c( "Chick", v ) ) %>% summarise( mw = mean( weight ) )
## Error

I'll post one solution, but curious to see how others have solved this.


Solution

  • The underscore functions of dplyr could be useful for that:

    ChickWeight %>% group_by_( "Chick", v )  %>% summarise( mw = mean( weight ) )
    

    From the new features in dplyr 0.3:

    You can now program with dplyr – every function that uses non-standard evaluation (NSE) also has a standard evaluation (SE) twin that ends in _. For example, the SE version of filter() is called filter_(). The SE version of each function has similar arguments, but they must be explicitly “quoted”.
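A note for readers on newer versions: the underscore verbs, including group_by_(), were deprecated in dplyr 0.7.0. Below is a minimal sketch of one way to do the same thing today, assuming dplyr 1.0.0 or later, where group_by() accepts across() with the tidyselect helper all_of(). The helper name mean_weight_by is hypothetical and only there for illustration; it is not part of dplyr or of the original answer.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): mean weight per group, where the
# grouping columns are supplied as a character vector of column names.
mean_weight_by <- function(data, group_cols) {
  data %>%
    group_by(across(all_of(group_cols))) %>%        # character vector -> grouping columns
    summarise(mw = mean(weight), .groups = "drop")  # drop grouping in the result
}

v <- "Diet"
by_chick_diet <- mean_weight_by(ChickWeight, c("Chick", v))

# Apply-style use: one summary table per candidate grouping variable.
results <- lapply(c("Diet", "Time"), function(col) mean_weight_by(ChickWeight, col))
```

all_of() raises an error if any of the named columns is missing, which tends to be what you want when the names come from a variable. On dplyr 0.7 and later, group_by(!!!rlang::syms(c("Chick", v))) should be an equivalent alternative.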