
Efficient use of a list for filtering in `dplyr`


My `filter_list` has a large number of elements. The filtering below works, but how can I make the `dplyr::filter()` call more concise, so that I don't have to write one condition per list element?

I couldn't get `all_of()` to work.

filter_list <- list(
  hair_color = c("blond", "brown"),
  skin_color = "light"
)

dplyr::starwars |> 
  dplyr::filter(
    hair_color %in% filter_list[["hair_color"]],
    skin_color %in% filter_list[["skin_color"]]
  )

Solution

  • We could use `purrr::reduce2()` to apply the `filter()` statements iteratively, one per list element, with the data frame as the accumulator, e.g.:

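    library(purrr)
    library(dplyr)
    
    # Start from starwars (.init) and filter once per list element:
    # x is the vector of allowed values, y is the matching column name.
    out <- starwars |> 
      reduce2(
        .x = filter_list, .y = names(filter_list), .init = _,
        .f = \(df, x, y) filter(df, .data[[y]] %in% x)
      )
    out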
    
    # A tibble: 8 × 14
      name     height  mass hair_color skin_color eye_color birth_year sex   gender homeworld species films vehicles
      <chr>     <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr>  <chr>     <chr>   <lis> <list>  
    1 Leia Or…    150    49 brown      light      brown             19 fema… femin… Alderaan  Human   <chr> <chr>   
    2 Beru Wh…    165    75 brown      light      blue              47 fema… femin… Tatooine  Human   <chr> <chr>   
    3 Padmé A…    185    45 brown      light      brown             46 fema… femin… Naboo     Human   <chr> <chr>   
    4 Cordé       157    NA brown      light      brown             NA NA    NA     Naboo     NA      <chr> <chr>   
    5 Dormé       165    NA brown      light      brown             NA fema… femin… Naboo     Human   <chr> <chr>   
    6 Raymus …    188    79 brown      light      brown             NA male  mascu… Alderaan  Human   <chr> <chr>   
    7 Rey          NA    NA brown      light      hazel             NA fema… femin… NA        Human   <chr> <chr>   
    8 Poe Dam…     NA    NA brown      light      brown             NA male  mascu… NA        Human   <chr> <chr>
    

    Check that the result matches the explicit filter from the question:

    all.equal(
      out, 
      dplyr::starwars |> 
        dplyr::filter(
          hair_color %in% filter_list[["hair_color"]],
          skin_color %in% filter_list[["skin_color"]]
        )
    )
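
As a possible alternative to the iterative approach above, the conditions could be built up front and spliced into a single `filter()` call. This is only a sketch, not part of the original answer: it assumes `rlang` is available (it is installed with `dplyr`), and the names `conds`, `out_spliced` and `expected` are made up for illustration.

```r
library(dplyr)
library(purrr)
library(rlang)

filter_list <- list(
  hair_color = c("blond", "brown"),
  skin_color = "light"
)

# Build one unevaluated condition per list element, e.g.
# .data[["hair_color"]] %in% c("blond", "brown")
conds <- imap(filter_list, \(vals, col) expr(.data[[!!col]] %in% !!vals))

# filter() rejects named arguments, so drop the names before splicing
out_spliced <- starwars |>
  filter(!!!unname(conds))

# Reference result: the explicit filter from the question
expected <- starwars |>
  filter(
    hair_color %in% filter_list[["hair_color"]],
    skin_color %in% filter_list[["skin_color"]]
  )

all.equal(out_spliced, expected)
```

Because all conditions end up in one `filter()` call, they are combined with a logical AND in a single pass, which mirrors the hand-written version in the question.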