Is there a way to pass a list of filter parameters to `dplyr::filter`?


I want to filter a dataframe on the values in multiple columns, without needing to hard code the columns and values inside the dplyr::filter call. Essentially, I want to avoid this:

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"), 
  c = 1:8
)
df_in
df_out <- df_in %>% 
  dplyr::filter(
    !grepl("a", a), b == "second", c < 5   ##  I want to avoid burying this in my code
  )
df_out

I want to do something like this, with an imaginary prep_function and eval_function:

filt_crit <- prep_function(!grepl("a", a), b == "second", c < 5)
df_out <- df_in %>% dplyr::filter(eval_function(filt_crit))
df_out

I can use rlang::expr to filter based on one criterion:

filt_crit1 <- rlang::expr(!grepl("a", a))
df_partial <- df_in %>% dplyr::filter(eval(filt_crit1))
df_partial

I've figured out a way to do this with purrr::reduce(dplyr::filter(...)), iterating over filt_crit:

filt_crit <- c(rlang::expr(!grepl("a", a)), rlang::expr(b == "second"), rlang::expr(c < 5))
df_out <- filt_crit %>% 
  purrr::reduce(\(acc, nxt) dplyr::filter(acc, eval(nxt)), .init = df_in)
df_out

This seems a bit clunky. Is purrr::reduce the most straightforward solution? Thanks!


Solution

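  • You can get the result you want by wrapping the filter conditions in rlang::exprs(), which creates a list of unevaluated expressions, and then passing that list to dplyr::filter() with the splice operator !!!. Each spliced expression becomes a separate argument to filter(), so the conditions are combined with AND: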

    df_in <- data.frame(
      a = c("first", "first", "first", "first", "last", "last", "last", "last"),
      b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
      c = 1:8
    )
    
    library(dplyr, warn.conflicts = FALSE)
    
    .filt_crit <- rlang::exprs(!grepl("a", a), b == "second", c < 5)
    
    df_in |> filter(!!!.filt_crit)
    #>       a      b c
    #> 1 first second 1
    #> 2 first second 2
    
    .filt_crit <- rlang::exprs(!grepl("a", a), c < 5)
    
    df_in |> filter(!!!.filt_crit)
    #>       a      b c
    #> 1 first second 1
    #> 2 first second 2
    #> 3 first  loser 3
    #> 4 first  loser 4
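
    If the aim is to keep the criteria out of the pipeline entirely, the same idea can be wrapped in a small helper. The following is only a sketch: filter_by, crit_sets and crit_text are hypothetical names made up for illustration, not part of dplyr or rlang. It also shows one way to build the list from strings (for example, read from a config file) with rlang::parse_expr, on the assumption that the strings come from a trusted source, since they are parsed as R code.

```r
library(dplyr, warn.conflicts = FALSE)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

# Hypothetical helper: apply a list of unevaluated conditions to a data frame.
# `!!!` splices the list so each condition is a separate filter() argument.
filter_by <- function(data, crit) {
  dplyr::filter(data, !!!crit)
}

# Criteria defined once, away from the pipeline (names are illustrative).
crit_sets <- list(
  strict = rlang::exprs(!grepl("a", a), b == "second", c < 5),
  loose  = rlang::exprs(!grepl("a", a), c < 5)
)

df_strict <- filter_by(df_in, crit_sets$strict)
df_loose  <- filter_by(df_in, crit_sets$loose)

# The same criteria built from strings; parse_expr() turns each string
# into an expression. Only do this with trusted input.
crit_text <- c('!grepl("a", a)', 'b == "second"', "c < 5")
crit_from_text <- lapply(crit_text, rlang::parse_expr)
df_text <- filter_by(df_in, crit_from_text)
```

    Here df_strict and df_text should contain the same rows as the first output above, and df_loose the same rows as the second.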