
Filter rows when a value can appear in any of several columns


I have this data:

# A tibble: 20 x 6
      ID style param1 param2 param3 param4
   <dbl> <chr> <chr>  <chr>  <chr>  <chr> 
 1     1 ar    R78    NA     NA     NA    
 2     2 bg    NA     NA     NA     NA    
 3     3 bh    NA     NA     NA     NA    
 4     4 ar    NA     R78    NA     NA    
 5     5 bg    NA     NA     NA     NA    
 6     6 bh    NA     NA     NA     NA    
 7     7 ar    R78    NA     NA     NA    
 8     8 bg    NA     NA     R78    NA    
 9     9 bh    NA     NA     NA     NA    
10    10 ar    NA     R78    NA     NA    
11    11 bg    NA     NA     NA     NA    
12    12 bh    NA     NA     R78    NA    
13    13 ar    NA     NA     NA     NA    
14    14 bg    R78    NA     NA     NA    
15    15 bh    NA     NA     NA     NA    
16    16 ar    NA     NA     NA     NA    
17    17 bg    NA     NA     NA     NA    
18    18 bh    R78    NA     NA     NA    
19    19 ar    NA     NA     NA     R78   
20    20 bg    NA     NA     NA     NA 

I want to use dplyr::filter to select the rows where R78 appears in any of the columns param1, param2, param3 or param4.

I tried:

data %>%
  filter(across(param1:param4) == "R78")

which returns:

# A tibble: 4 x 6
     ID style param1 param2 param3 param4
  <dbl> <chr> <chr>  <chr>  <chr>  <chr> 
1     1 ar    R78    NA     NA     NA    
2     7 ar    R78    NA     NA     NA    
3    14 bg    R78    NA     NA     NA    
4    18 bh    R78    NA     NA     NA  

This is the same result I get from data %>% filter(param1 == "R78").

...

Maybe I am misusing the across function. I have also tried chaining several conditions with "|", but that never worked.

I expect my code to return a tibble with rows 1, 4, 7, 8, 10, 12, 14, 18 and 19 only.

Thanks!


Solution

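  • across works column-wise: it applies the condition to each column separately rather than asking whether any column matches. In such cases I think it is better to use filter_at (superseded since dplyr 1.0.0, but still available):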

    library(dplyr)
    df %>% filter_at(vars(param1:param4), any_vars(. == 'R78'))
    
    #   ID style param1 param2 param3 param4
    #1   1    ar    R78   <NA>   <NA>   <NA>
    #4   4    ar   <NA>    R78   <NA>   <NA>
    #7   7    ar    R78   <NA>   <NA>   <NA>
    #8   8    bg   <NA>   <NA>    R78   <NA>
    #10 10    ar   <NA>    R78   <NA>   <NA>
    #12 12    bh   <NA>   <NA>    R78   <NA>
    #14 14    bg    R78   <NA>   <NA>   <NA>
    #18 18    bh    R78   <NA>   <NA>   <NA>
    #19 19    ar   <NA>   <NA>   <NA>    R78
    

    A hack to make across work is to combine its per-column results with Reduce and `|`:

    df %>% filter(Reduce(`|`, across(param1:param4, ~. == 'R78')))
    

    In base R, you can use rowSums:

    cols <- paste0('param', 1:4)
    df[rowSums(df[cols] == 'R78', na.rm = TRUE) > 0, ]
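
With dplyr 1.0.4 or later, if_any() expresses the same "any of these columns" test directly, without the Reduce hack. The following is a minimal sketch, not part of the original answer: it assumes dplyr >= 1.0.4 is installed and rebuilds the question's data under the name df (the construction code is an illustration; ID is stored as integer here rather than double).

```r
library(dplyr)

# Rebuild the example data from the question (assumed layout: 20 rows,
# style cycling through ar/bg/bh, four character param columns).
df <- tibble(
  ID     = 1:20,
  style  = rep(c("ar", "bg", "bh"), length.out = 20),
  param1 = NA_character_,
  param2 = NA_character_,
  param3 = NA_character_,
  param4 = NA_character_
)
df$param1[c(1, 7, 14, 18)] <- "R78"
df$param2[c(4, 10)]        <- "R78"
df$param3[c(8, 12)]        <- "R78"
df$param4[19]              <- "R78"

# Keep rows where at least one of param1..param4 equals "R78".
# Comparisons against NA give NA; rows where every comparison is NA
# are dropped by filter(), just like rows that evaluate to FALSE.
res <- df %>% filter(if_any(param1:param4, ~ .x == "R78"))

res$ID
# 1 4 7 8 10 12 14 18 19
```

if_all() is the counterpart for requiring the condition in every selected column.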