I have a dataframe with columns "V1" and "V2", and a vector Z:
Z <- c('931', '907', '905', '902', '8552', '855', '8542', '854', '8532', '853', '852', '851', '850')
I want to add a new variable "Match" to the dataframe which takes the value 1, 2, or 3 according to the following conditions:
Match=1, if the values in V1 and V2 are the same
Match=2, if the values in both V1 and V2 are among the values in vector Z
Match=3, if the value in V1 or V2 is not among the values in vector Z
The resulting dataframe should have the values as given in column Match.
V1 V2 Match
8552 689 3
576 8552 3
8552 907 2
8552 85 3
8552 902 2
8552 783 3
931 367 3
8552 1090 3
8552 905 2
8552 8552 1
8552 1004 3
113 907 3
8552 1001 3
8542 564 3
850 720 3
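You can use case_when() from the {dplyr} package, like so:
library(dplyr)
df %>% mutate(Match = case_when(V1 == V2 ~ 1,
                                V1 %in% Z & V2 %in% Z ~ 2,
                                !(V1 %in% Z) | !(V2 %in% Z) ~ 3))
case_when() checks the conditions in order and uses the first one that is TRUE, so a row where V1 and V2 are equal gets 1 even if both values are also in Z.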
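
To check the logic against the table in the question, here is a minimal base R sketch of the same three rules. It is an alternative written with nested ifelse() rather than the dplyr call above, and it assumes V1 and V2 are numeric columns (the question does not say); `in_z` and `expected` are names made up for this illustration.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  V1 = c(8552, 576, 8552, 8552, 8552, 8552, 931, 8552,
         8552, 8552, 8552, 113, 8552, 8542, 850),
  V2 = c(689, 8552, 907, 85, 902, 783, 367, 1090,
         905, 8552, 1004, 907, 1001, 564, 720)
)
Z <- c('931', '907', '905', '902', '8552', '855', '8542',
       '854', '8532', '853', '852', '851', '850')

# TRUE where both values are in Z. %in% coerces the numeric
# columns to character before comparing them with Z.
in_z <- df$V1 %in% Z & df$V2 %in% Z

# Nested ifelse() tests the rules in the same order as case_when():
# equal values first, then "both in Z", otherwise 3.
df$Match <- ifelse(df$V1 == df$V2, 1, ifelse(in_z, 2, 3))

# Match column from the question, used only for comparison.
expected <- c(3, 3, 2, 3, 2, 3, 3, 3, 2, 1, 3, 3, 3, 3, 3)
print(all(df$Match == expected))
```

The final ifelse() branch simply assigns 3 to everything that did not meet the first two rules, which is equivalent to the explicit third condition here because a row that fails "both in Z" has at least one value outside Z.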