Tags: r, dataframe, duplicates, subset, conditional-formatting

How to find duplicates in two columns only while another column is different


I want to find where two (or more) rows have the same x, y (location) but a different id.

In the table below, I would like to get back the last two rows only.

x y id
1 2 1
1 2 1
1 3 4
2 3 1
2 3 2

# example data
x <- read.table(text = "x   y   id
1   2   1
1   2   1
1   3   4
2   3   1
2   3   2", header = TRUE)

Solution

  • Using dplyr, group the rows by x and y, then keep only the groups that contain more than one distinct id:

    library(dplyr)

    x %>%
      group_by(x, y) %>%
      filter(n_distinct(id) > 1)
    
    # A tibble: 2 x 3
    # Groups:   x, y [1]
          x     y    id
      <int> <int> <int>
    1     2     3     1
    2     2     3     2
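
If you would rather not load dplyr, the same idea can be sketched in base R with ave(), which applies a function within each group and returns a vector aligned with the original rows. This is only one possible approach, not the answer's own method. The data frame is named df here (my own naming, to avoid clashing with the column x), and n_ids and result are likewise illustrative names.

```r
# Same example data as in the question, stored as `df` (an assumed name)
df <- read.table(text = "x y id
1 2 1
1 2 1
1 3 4
2 3 1
2 3 2", header = TRUE)

# For every row, count the distinct id values within its (x, y) group
n_ids <- ave(df$id, df$x, df$y, FUN = function(v) length(unique(v)))

# Keep only rows whose location is shared by more than one distinct id
result <- df[n_ids > 1, ]
result
```

Unlike duplicated(), which would also flag the first two rows (same x, y and same id), counting distinct ids per group keeps only locations where the id actually differs.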