Tags: r, tidyverse, purrr, tidyselect

How can I write a custom function with tidyverse verbs that accepts its grouping variable as a string?


I want to write a function that takes three parameters: a data set, a variable to group by, and a value to filter on. I want to write it in such a way that I can afterwards apply map() to it and pass the grouping variables to map() as a character vector. However, I don't know how to make my custom function rating() accept the grouping variable as a string. This is what I have tried:

    library(dplyr)
    library(purrr)

    data = tibble(a = 1:10,
                  g1 = c(rep("blue", 3), rep("green", 3), rep("red", 4)),
                  g2 = c(rep("pink", 2), rep("hotpink", 6), rep("firebrick", 2)),
                  na = NA,
                  stat = c(23, 43, 53, 2, 43, 18, 54, 94, 43, 87))

    # rank `stat` within each group of `by`, then keep the row where a == no
    rating = function(data, by, no){
      data %>%
        select(a, {{by}}, stat) %>%
        group_by({{by}}) %>%
        mutate(rank = rank(stat)) %>%
        ungroup() %>%
        filter(a == no)
    }

    rating(data = data, by = g2, no = 5) # this works

And this is how I want to use my function:


    map(.x = c("g1", "g2"), .f = ~rating(data = data, by = .x, no = 1))

... but I get:


    Error: Must group by variables found in `.data`.
    * Column `.x` is not found.


Solution

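  • Because the column names are passed as character strings, convert the argument to a symbol with rlang::ensym() and unquote it with !! inside the dplyr verbs. ensym() accepts either a string or a bare column name. In the map() call the string has to be injected with !!.x, so that ensym() sees the value ("g1", "g2") rather than the symbol .x.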

    library(dplyr)
    library(purrr)
    rating <- function(data, by, no){
      # turn `by` (a string or a bare name) into a symbol
      by <- rlang::ensym(by)
      data %>%
        select(a, !!by, stat) %>%
        group_by(!!by) %>%
        mutate(rank = rank(stat)) %>%
        ungroup() %>%
        filter(a == no)
    }
    

    Testing:

    > map(.x = c("g1", "g2"), .f = ~rating(data = data, by = !!.x, no = 1))
    [[1]]
    # A tibble: 1 × 4
          a g1     stat  rank
      <int> <chr> <dbl> <dbl>
    1     1 blue     23     1
    
    [[2]]
    # A tibble: 1 × 4
          a g2     stat  rank
      <int> <chr> <dbl> <dbl>
    1     1 pink     23     1
    

    It also works with unquoted input:

    > rating(data, by = g2, no = 5)
    # A tibble: 1 × 4
          a g2       stat  rank
      <int> <chr>   <dbl> <dbl>
    1     5 hotpink    43     3
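
If the function only ever needs to accept strings, a possible alternative is to skip the symbol conversion and let tidyselect look the column up by name. The sketch below is not part of the answer above: rating_chr is a hypothetical name, and it assumes dplyr 1.0.0 or later (for across()) and the same example data as in the question. With this version the map() call from the question can pass .x directly, without !!.

```r
library(dplyr)
library(purrr)

# Same example data as in the question (the all-NA column is left out).
data <- tibble(
  a    = 1:10,
  g1   = c(rep("blue", 3), rep("green", 3), rep("red", 4)),
  g2   = c(rep("pink", 2), rep("hotpink", 6), rep("firebrick", 2)),
  stat = c(23, 43, 53, 2, 43, 18, 54, 94, 43, 87)
)

# rating_chr is a hypothetical, string-only variant of rating():
# `by` must be a character vector of column names.
rating_chr <- function(data, by, no) {
  data %>%
    select(a, all_of(by), stat) %>%      # all_of() selects columns by name
    group_by(across(all_of(by))) %>%     # group by the named column(s)
    mutate(rank = rank(stat)) %>%
    ungroup() %>%
    filter(a == no)
}

# Strings can now be passed straight from map(), no !! needed.
res <- map(c("g1", "g2"), ~ rating_chr(data, by = .x, no = 1))
res
```

The trade-off is that rating_chr(data, by = g2, no = 5) with a bare column name no longer works; the column has to be given as "g2".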