
How to remove duplicate values in one column by ID in R


How can I remove duplicates by ID? For each Id, the Drug column must contain only unique values. Any help is appreciated.

    dat <- read.table(text="Id  Drug
    A   Meropenem
    A   Ampicillin
    A   Augmentin
    A   Meropenem
    A   Ampicillin
    A   Augmentin
    B   Meropenem
    B   Ampicillin
    B   Augmentin", header=TRUE)



This is the desired output: 
 
    dat.desired <- read.table(text="Id  Drug
    A   Meropenem
    A   Ampicillin
    A   Augmentin
    B   Meropenem
    B   Ampicillin
    B   Augmentin", header=TRUE)

Solution

  • Grouping with group_by() from dplyr lets you remove duplicates within each group only, so a Drug value is dropped only when it repeats under the same Id.

    library(dplyr)

    # Within each Id, keep only the first occurrence of each Drug
    dat %>% group_by(Id) %>% filter(!duplicated(Drug))
    
      Id    Drug      
      <chr> <chr>     
    1 A     Meropenem 
    2 A     Ampicillin
    3 A     Augmentin 
    4 B     Meropenem 
    5 B     Ampicillin
    6 B     Augmentin
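
For comparison, here is a minimal base R sketch of the same idea that does not need dplyr. It assumes that the Id and Drug columns together define what counts as a duplicate; the name `dat_unique` is only illustrative and does not come from the original answer.

```r
# Same sample data as in the question.
dat <- read.table(text = "Id Drug
A Meropenem
A Ampicillin
A Augmentin
A Meropenem
A Ampicillin
A Augmentin
B Meropenem
B Ampicillin
B Augmentin", header = TRUE)

# duplicated() on a data frame flags rows whose (Id, Drug) pair has
# already appeared; negating it keeps the first occurrence of each pair.
dat_unique <- dat[!duplicated(dat[, c("Id", "Drug")]), ]

# Reset row names, which otherwise keep the original row indices.
rownames(dat_unique) <- NULL

dat_unique
```

Because these two columns are the only ones in `dat`, `unique(dat)` should return the same six rows. If dplyr is already loaded, `distinct(dat, Id, Drug)` is another way to express this.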