I have
df<-data.frame(record_id=c("A", "B", "C", "D", "E", "F"), var1=1:6, matched.with=rev(c("A", "B", "C", "D", "E", "F")))
> df
record_id var1 matched.with
1 A 1 F
2 B 2 E
3 C 3 D
4 D 4 C
5 E 5 B
6 F 6 A
so, id A
has been matched with F
, B
with E
, C
with D
I would like to create a common.key
variable for the pairs like
df.common.key
record_id var1 matched.with common.key
1 A 1 F 1
2 B 2 E 2
3 C 3 D 3
4 D 4 C 3
5 E 5 B 2
6 F 6 A 1
Which shows that A
has been matched with F
(row 1) with key 1
and F
has been matched with A
(row 6) with also key 1
. Common key does not have to be numeric, it can also be string or factor.
How can I accomplish this 1) if my data frame includes only matched pairs, 2) if my DF also includes observations without matched pairs 3) is there a tidyverse solution?
We can get the unique combination of letters (in the same order) using pmin
and pmax
. Then, we can ust the grp
column to create the common.key
.
library(tidyverse)
df %>%
group_by(grp = paste0(pmin(record_id, matched.with), pmax(record_id, matched.with))) %>%
mutate(common.key = cur_group_id()) %>%
select(-grp)
Output
grp record_id var1 matched.with common.key
<chr> <chr> <int> <chr> <int>
1 AF A 1 F 1
2 BE B 2 E 2
3 CD C 3 D 3
4 CD D 4 C 3
5 BE E 5 B 2
6 AF F 6 A 1