
R lazyeval: pass parameters to dplyr::filter


I think this question has several variations (e.g. here, here and perhaps here), and probably even an answer somewhere.

How can I pass parameters to the filter function?

library(dplyr)
library(lazyeval)
set.seed(10)
data <- data.frame(a=sample(1:10, 100, T))

If I need to count the occurrences of each of the numbers 1 through 10 and show the counts for, say, 1, 2 and 3, I would do this:

data %>% 
  group_by(a) %>% 
  summarise(n = n()) %>% 
  filter(a < 4)

Gives:

# A tibble: 3 × 2
      a     n
  <int> <int>
1     1    11
2     2     8
3     3    16

Now, how can I put this into a function? Here grp is the grouping variable and no is the upper limit used in the filter.

   fun <- function(d, grp, no){
      d %>% 
        group_by_(grp) %>% 
        summarise_(n = interp(~ n() )) %>%
        filter_( grp < no)
        # This final line instead also does not work:  
        # filter_(interp( grp < no), grp = as.name(grp))
    }

Now,

fun(data, 'a', 4)

Gives:

# A tibble: 0 × 2
# ... with 2 variables: a <int>, n <int>
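
The empty result is most likely explained by standard evaluation: filter_ evaluates its arguments before dplyr sees them, so grp < no is computed on the string 'a' rather than on column a. A minimal base R sketch of that comparison, with no dplyr involved:

```r
grp <- "a"
no <- 4

# `<` on a character and a number coerces the number to a string,
# so this is the string comparison "a" < "4": one logical value.
res <- grp < no

# The result does not depend on the values stored in column a.
print(res)
print(identical(res, "a" < "4"))
```

Digits usually collate before letters, so the value is FALSE, and a filter condition that is a single FALSE keeps no rows.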

Solution

  • We can use the quosures approach from the development version of dplyr (soon to be released as 0.6.0):

    fun <- function(d, grp, no){
      grp <- enquo(grp)

      d %>%
        group_by(UQ(grp)) %>%
        summarise(n = n()) %>%
        filter(UQ(grp) < no)
    }
    
    fun(data, a, 4)
    # A tibble: 3 x 2
    #      a     n
    #  <int> <int>
    #1     1    11
    #2     2     8
    #3     3    16
    

    We use enquo to capture the input argument as a quosure. Within dplyr verbs such as group_by, summarise, mutate and filter, the quosure is evaluated by unquoting it (UQ or !!).


    The above function can also be modified to accept both quoted and unquoted arguments:

    fun <- function(d, grp, no){
      # Inspect the call to see whether grp was supplied as a string
      lst <- as.list(match.call())
      grp <- if (is.character(lst$grp)) {
        rlang::parse_quosure(grp)
      } else {
        enquo(grp)
      }

      d %>%
        group_by(UQ(grp)) %>%
        summarise(n = n()) %>%
        filter(UQ(grp) < no)
    }

    fun(data, a, 4)
    # A tibble: 3 x 2
    #      a     n
    #  <int> <int>
    #1     1    11
    #2     2     8
    #3     3    16
    
    fun(data, 'a', 4)
    # A tibble: 3 x 2
    #      a     n
    #  <int> <int>
    #1     1    11
    #2     2     8
    #3     3    16
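
On released versions of dplyr and rlang, the same idea is often written with the !! shorthand for UQ, and a string can be turned into a symbol with rlang::sym. The following is a sketch under those assumptions, not part of the original answer. fun_bare and fun_chr are hypothetical names, and the data is regenerated so the block runs on its own.

```r
library(dplyr)

set.seed(10)
data <- data.frame(a = sample(1:10, 100, TRUE))

# Hypothetical variant for a bare column name, e.g. fun_bare(data, a, 4)
fun_bare <- function(d, grp, no) {
  grp <- enquo(grp)                # capture the argument as a quosure
  d %>%
    group_by(!!grp) %>%            # !! unquotes, like UQ()
    summarise(n = n()) %>%
    filter((!!grp) < no)           # parentheses keep the unquote explicit
}

# Hypothetical variant for a column name given as a string, e.g. "a"
fun_chr <- function(d, grp, no) {
  grp <- rlang::sym(grp)           # turn the string into a symbol
  d %>%
    group_by(!!grp) %>%
    summarise(n = n()) %>%
    filter((!!grp) < no)
}

res_bare <- fun_bare(data, a, 4)
res_chr  <- fun_chr(data, "a", 4)
```

Both calls should return the same rows as the direct pipeline with filter(a < 4). The exact counts depend on the R version, because the default sampling algorithm behind sample() changed in R 3.6.0.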