r, function, dplyr, rlang

How to pass a column name to a function


I have written a function that performs a few analyses, including calculating a measure called "Net Promoter Score" using the NPS package.

library(dplyr)
library(tidyr)
library(NPS)
df<-data.frame(score = sample(c(0:10),15,replace=TRUE),
           variable = sample(c('A', 'B', 'C'),15,replace=TRUE)
)
analyzer <- function(df,var, sco){
    df %>% group_by_(var) %>% transmute(n= nps(sco)) %>% unique()
}
analyzer(df,'variable','score')

This returns NA for all levels of variable.

dplyr functions have a way of dealing with column names handed to them as character strings (i.e., their _ versions, such as group_by_, which I've used here), but the nps function doesn't. I also tried passing the score column as nps(.[[score]]), but this returns the NPS for the whole column and doesn't break it down by the group_by levels.
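
The whole-column result can be reproduced in a small sketch. It assumes a hypothetical stand-in, nps_sketch(), so that it runs without the NPS package (share of scores 9-10 minus share of scores 0-6), and it uses made-up data and summarise() rather than transmute(). Inside a %>% pipeline, `.` refers to the entire data frame that was piped in, not to the current group, so every group gets the same value:

```r
library(dplyr)

# Hypothetical stand-in for NPS::nps(), assumed for illustration only:
# share of promoters (9-10) minus share of detractors (0-6).
nps_sketch <- function(x) mean(x >= 9) - mean(x <= 6)

# Made-up example data
df <- data.frame(score    = c(10, 9, 3, 8, 2, 10),
                 variable = c("A", "A", "A", "B", "B", "B"))
sco <- "score"

# `.` is the whole (grouped) data frame, so .[[sco]] is the full column
# and each group receives the same overall value.
whole <- df %>%
  group_by(variable) %>%
  summarise(n = nps_sketch(.[[sco]]))

# For comparison: a bare column name is evaluated once per group.
by_group <- df %>%
  group_by(variable) %>%
  summarise(n = nps_sketch(score))
```

Here whole$n holds the same number for A and B, while by_group$n differs between the two groups.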


Solution

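  • It's because the inputs to the function are not evaluated the way you expect: inside transmute(), nps(sco) receives the character string "score" rather than the score column. Capturing the arguments as symbols and unquoting them with !! fixes this.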

    (Note that, as implemented here, the function works whether you pass a bare name, x = x, or a character string, x = "x".)

    library(dplyr)
    library(tidyr)
    library(NPS)
    set.seed(123)
    
    # data
    df <- data.frame(score = sample(c(0:10), 15, replace = TRUE),
                     variable = sample(c('A', 'B', 'C'), 15, replace = TRUE))
    
    # custom function
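    analyzer <- function(df, var, sco) {
      # ensym() captures each argument as a symbol, whether it was
      # supplied as a bare name or as a character string
      var <- rlang::ensym(var)
      sco <- rlang::ensym(sco)
    
      # !! unquotes the symbols so that dplyr evaluates them as columns
      df <- df %>% 
        group_by(!!var) %>% 
        transmute(n = NPS::nps(!!sco)) %>% 
        unique()
    
      return(df)
    }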
    
    # using function
    analyzer(df, 'variable', 'score')
    #> # A tibble: 3 x 2
    #> # Groups:   variable [3]
    #>   variable      n
    #>   <fct>     <dbl>
    #> 1 C        -0.333
    #> 2 A        -0.4  
    #> 3 B        -0.25
    

    Created on 2018-11-18 by the reprex package (v0.2.1)
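
    To check that bare names and character strings really behave the same, here is a minimal sketch. The names nps_sketch() and analyzer_sketch() are hypothetical: nps_sketch() is a stand-in for NPS::nps() (share of scores 9-10 minus share of scores 0-6) so that the example runs without the NPS package, and summarise() is swapped in for transmute() plus unique(). The data are made up.

```r
library(dplyr)
library(rlang)

# Hypothetical stand-in for NPS::nps(), assumed for illustration only
nps_sketch <- function(x) mean(x >= 9) - mean(x <= 6)

# Made-up example data
df <- data.frame(score    = c(10, 9, 3, 8, 2, 10),
                 variable = c("A", "A", "A", "B", "B", "B"))

analyzer_sketch <- function(df, var, sco) {
  # ensym() turns either `variable` or "variable" into the symbol `variable`
  var <- ensym(var)
  sco <- ensym(sco)

  df %>%
    group_by(!!var) %>%               # unquote the grouping column
    summarise(n = nps_sketch(!!sco))  # unquote the score column, per group
}

res_chr  <- analyzer_sketch(df, "variable", "score")  # character input
res_bare <- analyzer_sketch(df, variable, score)      # bare names
same <- isTRUE(all.equal(res_chr, res_bare))
```

    Both calls should give the same per-group table. One caveat: ensym() captures what was typed in the call, so passing a variable that merely holds the column name (v <- "variable"; analyzer_sketch(df, v, "score")) is captured as the symbol v, not as variable.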