r function dplyr summarize

Use a specific column inside a user-defined function with summarize() in dplyr


I have a dataset, mtcars, and I want to write a function that summarizes a given variable (e.g. mpg) over the rows where another variable has a particular value (e.g. vs = 1). In the code below, I want the mean of mpg where vs = 1 and, separately, the mean of mpg where am = 1. Running it fails with the following error:

Error in NextMethod("[") : object 'vs' not found

f_1 <- function(data, var){
  
  # Quote the variable so that we can use its name
  var         <- enquo(var)
  
  data %>%
    summarize(p_1          = mean(!!var[vs  == 1], na.rm = TRUE),
              p_2          = mean(!!var[am  == 1], na.rm = TRUE))
}


f_1(data = mtcars, var = mpg)

Solution

  • You can use the curly-curly operator ({{ }}), which quotes and unquotes the argument in one step, so enquo() is no longer needed:

    library(dplyr)
    
    f_1 <- function(data, var){
      data %>%
        summarize(p_1 = mean({{var}}[vs  == 1], na.rm = TRUE),
                  p_2 = mean({{var}}[am  == 1], na.rm = TRUE))
    }
    
    f_1(data = mtcars, var = mpg)
    
    #      p_1      p_2
    #1 24.55714 24.39231
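
If you would rather keep the enquo() / !! style from the question, a minimal sketch of an alternative is below. The likely cause of the original error is operator precedence: `[` binds more tightly than `!`, so `!!var[vs == 1]` is read as `!!(var[vs == 1])`, which tries to subset the quosure itself and looks for `vs` outside the data. Wrapping the unquote in parentheses avoids that. The name `f_2` is hypothetical and is used only to keep it apart from `f_1`.

```r
library(dplyr)

# Hypothetical variant of f_1 that keeps enquo() and !!
f_2 <- function(data, var) {
  # Capture the column expression supplied by the caller
  var <- enquo(var)

  data %>%
    # (!!var) is unquoted first, then subset by the condition
    summarize(p_1 = mean((!!var)[vs == 1], na.rm = TRUE),
              p_2 = mean((!!var)[am == 1], na.rm = TRUE))
}

f_2(data = mtcars, var = mpg)
```

This should give the same two means as the curly-curly version; the curly-curly form is usually easier to read because it needs neither enquo() nor the extra parentheses.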