Search code examples
rdataframedplyrsubset

Find rows where multiple columns have the same value


library(tidyverse)
d = data.frame(x=c('A','B','C'), y=c('A','B','D'), z=c('X','B','C'), a=1:3)
print(d)
  x y z a
1 A A X 1
2 B B B 2
3 C D C 3
d %>% filter(x==y) # Returns rows 1 and 2
d %>% filter(x==z) # Returns rows 2 and 3
d %>% filter(x==y & x==z) # Returns row 2

How can I do what the very last line is doing with more concise syntax for some arbitrary set of columns? For example, filter(all.equal(x,y,z)) which doesn't work but expresses the idea.


Solution

  • With comparisons, on multiple columns, an easier option is to take one column out (x), while keeping the rest by looping in if_all, then do the ==, so that it will return only TRUE when all the comparisons for that particular row is TRUE

    library(dplyr)
    d %>% 
      filter(if_all(y:z, ~ x == .x))