R user-defined/dynamic summary function within dplyr::summarise


Somewhat hard to define this question without sounding like lots of similar questions!

I have a function where I want one of the parameters to be a function name that is passed to dplyr::summarise, e.g. "mean" or "sum":

data(mtcars)
f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  zColquo = quo_name(zCol)

  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              !!zColquo := mean(!!sym(zColquo))) %>% # mean should be zFun, user-defined
    ungroup()

  cellSummaries
}

(this groups by gear and cyl, then returns, per group, count and mean(disp))

Per my note, I'd like 'mean' to be dynamic, performing the function defined by zFun, but I can't for the life of me work out how to do it! Thanks in advance for any advice.


Solution

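  • You can use match.fun to make the function dynamic: it takes a function name as a string (or a function object) and returns the function itself, which you can then call. I also removed zColquo, as it's not needed because zCol is already a string.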

    library(dplyr)
    library(rlang)
    
    f <- function(x = mtcars,
                  groupcol = "cyl",
                  zCol = "disp",
                  zFun = "mean") {
    
      cellSummaries <- x %>%
                       group_by(gear, !!sym(groupcol)) %>% 
                       summarise(Count = n(), 
                                 !!zCol := match.fun(zFun)(!!sym(zCol))) %>%
                       ungroup()
    
      return(cellSummaries)
    }
    

    You can then check the output:

    f()
    
    # A tibble: 8 x 4
    #   gear   cyl Count  disp
    #  <dbl> <dbl> <int> <dbl>
    #1     3     4     1  120.
    #2     3     6     2  242.
    #3     3     8    12  358.
    #4     4     4     8  103.
    #5     4     6     4  164.
    #6     5     4     2  108.
    #7     5     6     1  145 
    #8     5     8     2  326 
    
    f(zFun = "sum")
    
    # A tibble: 8 x 4
    #   gear   cyl Count  disp
    #  <dbl> <dbl> <int> <dbl>
    #1     3     4     1  120.
    #2     3     6     2  483 
    #3     3     8    12 4291.
    #4     4     4     8  821 
    #5     4     6     4  655.
    #6     5     4     2  215.
    #7     5     6     1  145 
    #8     5     8     2  652
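
To see what match.fun is doing on its own, here is a minimal base R sketch, separate from the dplyr pipeline. The helper name apply_summary and the variables by_name, by_object and by_range are made up for illustration and are not part of the answer above; the point is only that match.fun accepts either a string naming a function or a function object.

```r
# Hypothetical helper (not from the answer above): applies a summary
# function to a numeric vector, resolving the function with match.fun().
apply_summary <- function(values, zFun = "mean") {
  fun <- match.fun(zFun)  # accepts a function or a string naming one
  fun(values)
}

disp <- mtcars$disp  # mtcars ships with base R's datasets package

# Function given by name, as in f(zFun = "sum")
by_name <- apply_summary(disp, "sum")

# Function object passed directly, no quoting needed
by_object <- apply_summary(disp, sum)

# Anonymous function, here the range width of the column
by_range <- apply_summary(disp, function(v) max(v) - min(v))
```

Since match.fun returns a function object unchanged, the f() defined above should accept the same forms, for example f(zFun = median) or f(zFun = function(v) max(v) - min(v)), although only the string form is shown in the output above.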