r, unique, intersect

Why does unique command act like intersect command?


I have checked the questions but I couldn't find any explanation.

I have two vectors, and I want only the elements that one has and the other doesn't.

I defined two hypothetical vectors like this:

hypo1=c("a01","a02","a03","a04","b01","b02","b03","b04","c01","c02","c03","c04")
hypo2=c("a03","a04","b01","b02","c02","c03")

Then I wanted to see the elements common to both vectors.

intersect(hypo1,hypo2)
[1] "a03" "a04" "b01" "b02" "c02" "c03"

which seems to work fine.

However, when I wanted to see the unique elements that hypo1 has and hypo2 doesn't, I got all the elements of the first vector back:

unique(hypo1,hypo2)
 [1] "a01" "a02" "a03" "a04" "b01" "b02" "b03" "b04" "c01" "c02" "c03" "c04"

Then I swapped the order of the vectors, and it gave the same result as the intersect command:

unique(hypo2,hypo1)
[1] "a03" "a04" "b01" "b02" "c02" "c03"

I did some digging on the web but couldn't find what I'm missing. I need to find the elements that one vector has and the other doesn't.


Solution

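  • You want setdiff(hypo1, hypo2), which returns the elements of hypo1 that are not in hypo2. unique(hypo2, hypo1) means something completely different: it returns the unique entries in hypo2, but allows values to be duplicated if they are listed in hypo1, because the second argument of unique is incomparables. This is explained on the help page ?unique. Since hypo2 contains no duplicates, unique(hypo2, hypo1) simply returns hypo2 unchanged, and because every element of hypo2 also appears in hypo1, that happens to match intersect(hypo1, hypo2).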

    For example,

    unique(c(1,2,2,3,3,4,4,4), c(3,4))
    

    gives

    [1] 1 2 3 3 4 4 4
    

    because 3 and 4 have been declared to be "incomparables". On the other hand,

    setdiff(c(1,2,2,3,3,4,4,4), c(3,4))
    

    gives

    [1] 1 2
    

    which is what I think you were looking for.
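
Applied to the vectors from the question, the same idea can be sketched as follows. The variable names only_in_hypo1 and only_in_hypo2 are made up for illustration, and the last check assumes, as in the question, that hypo2 contains no repeated values.

```r
hypo1 <- c("a01","a02","a03","a04","b01","b02","b03","b04","c01","c02","c03","c04")
hypo2 <- c("a03","a04","b01","b02","c02","c03")

# Elements that hypo1 has and hypo2 does not
only_in_hypo1 <- setdiff(hypo1, hypo2)
print(only_in_hypo1)
# [1] "a01" "a02" "b03" "b04" "c01" "c04"

# Elements that hypo2 has and hypo1 does not (none, since hypo2 is a subset of hypo1)
only_in_hypo2 <- setdiff(hypo2, hypo1)
print(length(only_in_hypo2))
# [1] 0

# The second argument of unique() is 'incomparables', not a second set.
# hypo2 has no duplicates, so the call just returns hypo2 unchanged.
print(identical(unique(hypo2, incomparables = hypo1), hypo2))
# [1] TRUE
```

Note that the order of the arguments to setdiff matters: it keeps the elements of its first argument that are absent from its second.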