r function rlang

Pass a function with its parameters as a function argument in R


I'm struggling to understand the nuances of passing things into a function in R. The rlang package confuses me, and I can't find a good guide on when to use rlang::sym and the related functions.

Anyway, I'm trying to write a function that accepts a user-defined function together with its parameters, for example means, quantiles, etc. I want user_metric to always be a quoted string, and it must be able to carry its own arguments, such as na.rm = TRUE. Can someone show me how to make this work, so that passing either 'mean' or 'mean(., na.rm = TRUE)' would work?

library(tidyverse)
group_by_metrics=function(data, group_col, user_metric){

metrics = data %>% group_by(!!rlang::sym(group_col)) %>% summarise_all(.funs = funs(!!rlang::syms(user_metric)))

return(metrics)
}

group_by_metrics(data=mtcars, group_col='vs', user_metric='mean')
group_by_metrics(data=mtcars, group_col='vs', user_metric='mean(., na.rm = TRUE)')
group_by_metrics(data=mtcars, group_col='vs', user_metric='quantile(., probs=0.95, na.rm = TRUE)')

Solution

  • You have to distinguish your first case, where you simply provide a function name, from your other cases, where you are effectively defining a lambda function. For the former, use match.fun to find the function by name. For the latter, prepend "~" to the string to turn it into a formula, then use purrr::as_mapper() to make a function out of it. Use ensym instead of sym so that group_col can also be passed unquoted.

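    group_by_metrics <- function(.data, group_col, user_metric)
    {
      # Case 1: user_metric is a plain function name such as "mean";
      # possibly() returns NULL instead of raising an error if the lookup fails
      f <- purrr::possibly( match.fun, NULL )(user_metric)
      # Case 2: treat the string as a lambda body, e.g. "mean(., na.rm=TRUE)"
      if( is.null(f) )
          f <- str_c( "~", user_metric ) %>% as.formula %>% as_mapper
      .data %>% group_by(!!rlang::ensym(group_col)) %>% summarize_all( f )
    }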
    
    group_by_metrics( mtcars, "vs", "quantile(., probs=0.95, na.rm=TRUE)" )
    #      vs   mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
    #   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    # 1     0  21.7     8  462.  275.  4.25  5.36  18.0     1  5     6.30
    # 2     1  32.9     6  237.  123   4.47  3.45  21.2     1  4.35  4   
    
    ## Using ensym instead of sym allows you to drop " for group_col
    group_by_metrics( mtcars, vs, "mean" )
    #      vs   mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
    #   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    # 1     0  16.6  7.44  307. 190.   3.39  3.69  16.7 0.333  3.56  3.61
    # 2     1  24.6  4.57  132.  91.4  3.86  2.61  19.3 0.5    3.86  1.79
    

    Note that you can avoid all this conversion if you pass the additional arguments separately, using ...:

    group_by_metrics2 <- function(.data, group_col, user_metrics, ...)
    { 
      .data %>% group_by(!!rlang::ensym(group_col)) %>% 
        summarize_all( user_metrics, ... ) 
    }
    
    group_by_metrics2( mtcars, "vs", "quantile", probs=0.05, na.rm=TRUE)
    # # A tibble: 2 x 11
    #      vs   mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
    #   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    # 1     0  10.4   5.7 141.  107.   2.90  2.55  14.6     0     3     2
    # 2     1  18.0   4    74.1  58.5  2.97  1.58  17.8     0     3     1
    

    In the last example, the string quotes " are optional around both vs and quantile.
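
To see the two cases from the answer in isolation, here is a minimal sketch of just the string-to-function step, outside of any dplyr pipeline. The helper name as_metric_fun is hypothetical (it is not part of the answer or of any package), and paste0 is used in place of str_c only to keep the dependencies down to purrr.

```r
library(purrr)

# Hypothetical helper: turn a metric string into a callable function.
as_metric_fun <- function(user_metric) {
  # First try the string as a plain function name, e.g. "mean".
  # possibly() swallows the lookup error and returns NULL instead.
  f <- possibly(match.fun, NULL)(user_metric)
  if (is.null(f)) {
    # Otherwise treat it as a lambda body: "~mean(., na.rm = TRUE)"
    # becomes a one-sided formula, which as_mapper() turns into a function
    # whose first argument is available as `.`
    f <- as_mapper(as.formula(paste0("~", user_metric)))
  }
  f
}

x <- c(1, 2, 3, NA)

by_name   <- as_metric_fun("mean")
by_lambda <- as_metric_fun("mean(., na.rm = TRUE)")

by_name(x)    # NA, because mean() keeps the missing value by default
by_lambda(x)  # 2
```

Either result can be handed to summarize_all() as is, which is all the first group_by_metrics() above does with it.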