Search code examples
rmatchkey-valuematching

Identify matched pairs and create a common key


I have

df<-data.frame(record_id=c("A", "B", "C", "D", "E", "F"), var1=1:6, matched.with=rev(c("A", "B", "C", "D", "E", "F")))

> df
  record_id var1 matched.with
1         A    1            F
2         B    2            E
3         C    3            D
4         D    4            C
5         E    5            B
6         F    6            A

so, id A has been matched with F, B with E, C with D

I would like to create a common.key variable for the pairs like

df.common.key
  record_id var1 matched.with common.key
1         A    1            F          1
2         B    2            E          2
3         C    3            D          3
4         D    4            C          3
5         E    5            B          2
6         F    6            A          1

Which shows that A has been matched with F (row 1) with key 1 and F has been matched with A (row 6) with also key 1. Common key does not have to be numeric, it can also be string or factor.

How can I accomplish this 1) if my data frame includes only matched pairs, 2) if my DF also includes observations without matched pairs 3) is there a tidyverse solution?


Solution

  • We can get the unique combination of letters (in the same order) using pmin and pmax. Then, we can ust the grp column to create the common.key.

    library(tidyverse)
    
    df %>% 
      group_by(grp = paste0(pmin(record_id, matched.with), pmax(record_id, matched.with))) %>% 
      mutate(common.key = cur_group_id()) %>% 
      select(-grp)
    

    Output

      grp   record_id  var1 matched.with common.key
      <chr> <chr>     <int> <chr>             <int>
    1 AF    A             1 F                     1
    2 BE    B             2 E                     2
    3 CD    C             3 D                     3
    4 CD    D             4 C                     3
    5 BE    E             5 B                     2
    6 AF    F             6 A                     1