Passing multiple columns from function's argument to group_by

Consider the following example:

library(tidyverse)

df <- tibble(
  cat = rep(1:2, times = 4, each = 2),
  loc = rep(c("a", "b"), each = 8),
  value = rnorm(16)
)

df %>% 
  group_by(cat, loc) %>% 
  summarise(mean = mean(value), .groups = "drop")

# # A tibble: 4 x 3
# cat loc     mean
# * <int> <chr>  <dbl>
# 1     1 a     -0.563
# 2     1 b     -0.394
# 3     2 a      0.159
# 4     2 b      0.212

I would like to wrap the last two lines in a function that takes a group argument, so that multiple columns can be passed to group_by.

Here's a dummy function that computes the mean values by a group of columns as an example:

group_mean <- function(data, col_value, group) {
  data %>% 
    group_by(across(all_of(group))) %>% 
    summarise(mean = mean({{col_value}}), .groups = "drop")
}

group_mean(df, value, c("cat", "loc"))

# # A tibble: 4 x 3
# cat loc     mean
# * <int> <chr>  <dbl>
# 1     1 a     -0.563
# 2     1 b     -0.394
# 3     2 a      0.159
# 4     2 b      0.212

The function works but I would prefer a tidyselect/rlang approach to avoid quoting column names, like so:

group_mean(df, value, c(cat, loc))

# Error: Problem adding computed columns in `group_by()`.
# x Problem with `mutate()` input `..1`.
# x object 'loc' not found
# ℹ Input `..1` is `across(all_of(c(cat, loc)))`.

Enclosing group in {{}} works for a single column but not for multiple columns. How can I do that?

Solution

Consider using ... to capture the grouping columns. After converting them to symbols with ensyms, the function accepts column names either quoted or unquoted:

group_mean <- function(data, col_value, ...) {
  data %>% 
    group_by(!!!ensyms(...)) %>% 
    summarise(mean = mean({{col_value}}), .groups = "drop")
}

-testing

> group_mean(df, value, cat, loc)
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
> group_mean(df, value, 'cat', 'loc')
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
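
The means above differ from those in the question only because rnorm() is called without a seed. As a minimal sketch for checking that the quoted and unquoted calls really agree, the following fixes a seed and compares the two results. The seed value and the names unquoted and quoted are illustrative assumptions, and rlang is attached explicitly so that ensyms is available.

```r
library(dplyr)
library(rlang)

set.seed(42)  # arbitrary seed, used only to make the example reproducible
df <- tibble(
  cat = rep(1:2, times = 4, each = 2),
  loc = rep(c("a", "b"), each = 8),
  value = rnorm(16)
)

group_mean <- function(data, col_value, ...) {
  data %>%
    group_by(!!!ensyms(...)) %>%   # symbols and strings both become symbols
    summarise(mean = mean({{ col_value }}), .groups = "drop")
}

unquoted <- group_mean(df, value, cat, loc)
quoted   <- group_mean(df, value, "cat", "loc")

# Compare the two summaries
all.equal(unquoted, quoted)
```

Both calls should produce the same four-row summary, since ensyms turns each string into the same symbol that an unquoted name would give.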

If ... is already used for other arguments, one option is to capture the group argument unevaluated with substitute and convert its elements to strings:

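group_mean <- function(data, col_value, group) {
  # Capture the unevaluated argument, e.g. c(cat, loc), and split it into parts
  grp_lst <- as.list(substitute(group))
  # For a call such as c(cat, loc), drop the first element (the function name c)
  if (length(grp_lst) > 1) grp_lst <- grp_lst[-1]
  # Convert the remaining symbols (or strings) to column names
  grps <- purrr::map_chr(grp_lst, rlang::as_string)
  data %>% 
    group_by(across(all_of(grps))) %>% 
    summarise(mean = mean({{col_value}}), .groups = "drop")
}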

-testing

> group_mean(df, value, c(cat, loc))
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
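
Another option, closer to the tidyselect style asked for in the question, is to embrace the argument inside across(), which takes a tidyselect specification and so accepts c(cat, loc) directly. This is only a sketch, assuming dplyr 1.0.0 or later; group_mean2 is a hypothetical name chosen to avoid clashing with the functions above, and the seed is arbitrary.

```r
library(dplyr)

group_mean2 <- function(data, col_value, group) {
  data %>%
    # {{ group }} forwards the caller's expression to across(),
    # where it is interpreted as a tidyselect selection
    group_by(across({{ group }})) %>%
    summarise(mean = mean({{ col_value }}), .groups = "drop")
}

set.seed(42)  # arbitrary seed, used only to make the example reproducible
df <- tibble(
  cat = rep(1:2, times = 4, each = 2),
  loc = rep(c("a", "b"), each = 8),
  value = rnorm(16)
)

by_both <- group_mean2(df, value, c(cat, loc))  # several unquoted columns
by_cat  <- group_mean2(df, value, cat)          # a single unquoted column
by_both
```

Compared with the substitute approach, this leaves the parsing of the argument to tidyselect, so a single column or a c() of columns can be passed without special handling.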