r dplyr tidyverse

Filter across columns with equal values


I'd like to filter out the rows of a data frame in which at least two columns have equal values, and do it dynamically so that it works for any number of columns.
Suppose the following data frame:

df <- data.frame(Var1 = c(1, 1, 1, 2, 3, 4, 4), Var2 = c(1, 2, 6, 8, 2, 5, 4), Var3 = c(1, 3, 5, 6, 7, 5, 6))
df
  Var1 Var2 Var3
1    1    1    1
2    1    2    3
3    1    6    5
4    2    8    6
5    3    2    7
6    4    5    5
7    4    4    6

I'd like to keep only the rows whose values are all different: if at least two columns in a row are equal, that row should be dropped. The final result should be:

  Var1 Var2 Var3
2    1    2    3
3    1    6    5
4    2    8    6
5    3    2    7

As the number of columns in my data frame is expected to grow, a dynamic solution would be best, something in the spirit of mutate(across()).

Thank you very much


Solution

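  • library(tidyverse)
    
    # Keep a row only if all of its values differ, i.e. the number of
    # distinct values in the row equals the number of columns.
    df %>% 
      rowwise() %>% 
      filter(n_distinct(c_across(everything())) == ncol(df))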
    
    # A tibble: 4 × 3
    # Rowwise: 
       Var1  Var2  Var3
      <dbl> <dbl> <dbl>
    1     1     2     3
    2     1     6     5
    3     2     8     6
    4     3     2     7
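
For comparison, the same filter can be sketched in base R without rowwise(). This is only an illustrative alternative, not part of the answer above. It assumes all columns are of the same type, because apply() converts the data frame to a matrix first. The names all_distinct and result are made up for this example.

```r
# Example data from the question.
df <- data.frame(
  Var1 = c(1, 1, 1, 2, 3, 4, 4),
  Var2 = c(1, 2, 6, 8, 2, 5, 4),
  Var3 = c(1, 3, 5, 6, 7, 5, 6)
)

# anyDuplicated() returns 0 when a vector contains no repeated values,
# so this flags the rows whose values are all different.
# apply(df, 1, ...) walks over rows and works for any number of columns.
all_distinct <- apply(df, 1, function(row) anyDuplicated(row) == 0)

# Keep only the flagged rows; the original row names are preserved.
result <- df[all_distinct, ]
print(result)
```

Because the check is done per row on whatever columns are present, adding further columns to df needs no change to the filter.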