I am working with two datasets (dataset1 and dataset2) that both consist of many customer emails. I would like to identify which emails are unique to each dataset and which are "overlapping" (observed in both datasets). I would like to end up with 3 datasets: one with the emails found only in dataset1, one with the emails found only in dataset2, and one with the emails found in both.
Here's an example for reproducibility:
dataset1 <- data.frame(email = c("A", "B", "C", "D", "E" ))
dataset2 <- data.frame(email = c("X", "Y", "Z", "D", "E" ))
The result should be:
Thank you!
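You can use %in%, which returns TRUE for each email on the left-hand side that also appears in the vector on the right-hand side. Negate it with ! to keep the emails unique to one dataset: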
result1 <- subset(dataset1, !email %in% dataset2$email)
result1
#   email
# 1     A
# 2     B
# 3     C
result2 <- subset(dataset2, !email %in% dataset1$email)
result2
#   email
# 1     X
# 2     Y
# 3     Z
result3 <- subset(dataset1, email %in% dataset2$email)
result3
#   email
# 4     D
# 5     E
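
If you only need the email values and not the full rows, base R's set functions are another option. This is a minimal sketch, not a replacement for the %in% approach: the names only_in_1, only_in_2 and in_both are made up for illustration, and stringsAsFactors = FALSE is added so the columns are plain character vectors on older R versions too. Note that setdiff() and intersect() drop duplicate values, whereas subset() with %in% keeps every matching row.

```r
# Same example data as in the question
dataset1 <- data.frame(email = c("A", "B", "C", "D", "E"), stringsAsFactors = FALSE)
dataset2 <- data.frame(email = c("X", "Y", "Z", "D", "E"), stringsAsFactors = FALSE)

# Emails that appear only in dataset1
only_in_1 <- setdiff(dataset1$email, dataset2$email)

# Emails that appear only in dataset2
only_in_2 <- setdiff(dataset2$email, dataset1$email)

# Emails that appear in both datasets
in_both <- intersect(dataset1$email, dataset2$email)

only_in_1
# [1] "A" "B" "C"
only_in_2
# [1] "X" "Y" "Z"
in_both
# [1] "D" "E"
```

With real email addresses, differences in case or stray whitespace would count as mismatches, so it may be worth normalizing both columns first (for example with tolower(trimws(email))), assuming that fits your data.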