r, function, group-by, dplyr, nse

Using dplyr group_by in a function


I am trying to use dplyr's group_by inside a function of my own, for example:

testFunction <- function(df, x) {
  df %>%
    group_by(x) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

testFunction(iris, Species)

When I call it, I get the error "... unknown variable to group by: x". I've also tried group_by_, but it gives me a summary of the entire dataset. Does anybody have a clue how I can fix this?

Thanks in advance!


Solution

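  • Here is one way to do it with enquo, which dplyr re-exports from rlang (available since dplyr 0.7.0). enquo captures the expression passed as the argument x (here the bare column name Species, not a string) and wraps it in a quosure. The quosure is then unquoted with !! (or the older UQ) inside group_by, mutate, summarise, etc., so the verb sees the column the caller named rather than a column literally called x.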

    library(dplyr)
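    testFunction <- function(df, x) {
      # Capture the caller's unevaluated argument as a quosure
      x <- enquo(x)
      df %>%
        # Unquote the quosure so group_by groups by the captured column
        group_by(!!x) %>%
        summarize(mean.Petal.Width = mean(Petal.Width))
    }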
    
    testFunction(iris, Species)
    # A tibble: 3 x 2
    #     Species mean.Petal.Width
    #      <fctr>            <dbl>
    #1     setosa            0.246
    #2 versicolor            1.326
    #3  virginica            2.026
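
The same capture-and-unquote idea can be written in two other ways, sketched below under some assumptions. The function names testFunctionEmbrace and testFunctionString are hypothetical and only used for illustration. The first variant assumes rlang 0.4.0 or later (as shipped alongside dplyr 1.0.0), where the embrace operator {{ }} combines enquo and !! in one step. The second variant assumes the caller passes the column name as a string, which may be closer to what the group_by_ attempt in the question was after; it turns the string into a symbol with sym before unquoting.

```r
library(dplyr)

# Hypothetical variant: {{ x }} is shorthand for !!enquo(x)
# (assumes rlang >= 0.4.0)
testFunctionEmbrace <- function(df, x) {
  df %>%
    group_by({{ x }}) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

# Hypothetical variant for a column name held in a string:
# sym() converts "Species" into the symbol Species, and !! unquotes it
testFunctionString <- function(df, x) {
  df %>%
    group_by(!!sym(x)) %>%
    summarize(mean.Petal.Width = mean(Petal.Width))
}

by_symbol <- testFunctionEmbrace(iris, Species)    # bare column name
by_string <- testFunctionString(iris, "Species")   # column name as a string
```

Both calls should give the same three-row summary as testFunction(iris, Species) above; which form to prefer mostly depends on whether callers have a bare column name or a string in hand.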