Remove duplicates across multiple vectors


I want to remove every value that appears more than once across several vectors, without keeping even a single copy of it. For example, for these vectors:

a <- c("dog", "fish", "cow")
b <- c("dog", "horse", "mouse")
c <- c("cat", "sheep", "mouse")

the expected result would be:

a <- c("fish", "cow")
b <- c("horse")
c <- c("cat", "sheep")

Is there a way to achieve this without concatenating the vectors and splitting them again?


Solution

  • You could perhaps do:

    vec <- c(a, b, c)
    # lapply() always returns a list; sapply() would simplify equal-length results to a matrix
    lapply(list(a, b, c), function(x) x[!x %in% vec[duplicated(vec)]])
    
    [[1]]
    [1] "fish" "cow" 
    
    [[2]]
    [1] "horse"
    
    [[3]]
    [1] "cat"   "sheep"
    

    If you need the results as individual variables in the global environment, build a named list with lst() from the tibble package and pass it to list2env():

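    vec <- c(a, b, c)
    # tibble::lst() names each element after its variable; lapply() keeps those names
    l <- lapply(tibble::lst(a, b, c), function(x) x[!x %in% vec[duplicated(vec)]])
    list2env(l, envir = .GlobalEnv)  # overwrites a, b and c with the filtered vectors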
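
The same idea can be wrapped in a small base R function, which avoids the tibble dependency. This is only a sketch: the name drop_shared() is hypothetical and not part of the original answer, and it assumes every input is a plain atomic vector of the same type. Note that it still pools the values once to find the duplicates, but it never splits the pooled vector back apart.

```r
# Hypothetical helper (not from the original answer): drop every value that
# occurs more than once across all of the supplied vectors.
drop_shared <- function(...) {
  vecs <- list(...)                            # keeps any argument names
  pooled <- unlist(vecs, use.names = FALSE)    # pooled only to find duplicates
  dupes <- unique(pooled[duplicated(pooled)])  # values seen at least twice
  lapply(vecs, function(x) x[!x %in% dupes])   # filter each vector separately
}

a <- c("dog", "fish", "cow")
b <- c("dog", "horse", "mouse")
c <- c("cat", "sheep", "mouse")

# Naming the arguments gives a named list as the result
res <- drop_shared(a = a, b = b, c = c)
str(res)
```

Because the result is a named list, it can be handed to list2env(res, envir = .GlobalEnv) if separate variables are needed. Values repeated within a single vector are also treated as duplicates and removed.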