How to identify unique IDs and the overlap of two datasets in R

I am working with two datasets (dataset1 and dataset2) that both contain many customer emails. I would like to identify which emails are unique to each dataset and which are "overlapping" (observed in both datasets). I would like to end up with 3 datasets:

one with emails unique to dataset1
one with emails unique to dataset2
one with emails that are observed in both dataset1 and dataset2 (overlap)

Here's an example for reproducibility:

dataset1 <- data.frame(email = c("A", "B", "C", "D", "E" ))
dataset2 <- data.frame(email = c("X", "Y", "Z", "D", "E" ))

The result should be:

result1 consists of email "A", "B", "C"
result2 consists of email "X", "Y", "Z"
result3 consists of email "D", "E"

Thank you!

Solution

You can use %in%, which tests, for each element on its left, whether it appears in the vector on its right:

result1 <- subset(dataset1, !email %in% dataset2$email)
result1

#  email
#1     A
#2     B
#3     C

result2 <- subset(dataset2, !email %in% dataset1$email)
result2

#  email
#1     X
#2     Y
#3     Z

result3 <- subset(dataset1, email %in% dataset2$email)
result3

#  email
#4     D
#5     E
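
If you only need the email values rather than whole rows, a minimal sketch with base R's set functions (setdiff and intersect) gives the same three groups. The names only1, only2 and both are just illustrative. Note that these functions return plain vectors and drop duplicate values, whereas the subset() approach above keeps every matching row and any other columns.

```r
# Example data from the question
dataset1 <- data.frame(email = c("A", "B", "C", "D", "E"))
dataset2 <- data.frame(email = c("X", "Y", "Z", "D", "E"))

# Emails found only in dataset1
only1 <- setdiff(dataset1$email, dataset2$email)

# Emails found only in dataset2
only2 <- setdiff(dataset2$email, dataset1$email)

# Emails found in both datasets
both <- intersect(dataset1$email, dataset2$email)

only1
# [1] "A" "B" "C"
only2
# [1] "X" "Y" "Z"
both
# [1] "D" "E"
```

With real email data, matching is exact, so you may want to normalize the column first (for example with tolower() and trimws()) if differences in case or stray whitespace should not count as different emails.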