How to pull IDs that are interlinked?


I have two columns, id1 and id2, that hold interconnected IDs. I am looking for a solution in R that groups those IDs based on their relationships. The basic idea is that IDs 313-320 are all interlinked: for example, 313 is linked to 314 (row 1), and because 314 is linked to 316 (row 7), 313 and 316 are also linked, and so on. The solution has to follow these links and chain them together so that 313-320 end up in one group and 321-328 in a second.

id1 <- c(313,313,313,313,313,314,314,314,314,315,317,317,317,318,318,319,321,321,321,321,321,321,321,322,322,322,322,322,322,323,323,323,323,323,324,324,324,324,325,325,325,326,326,327)

id2 <- c(314,315,316,319,320,315,316,319,320,316,318,319,320,319,320,320,322,323,324,325,326,327,328,323,324,325,326,327,328,324,325,326,327,328,325,326,327,328,326,327,328,327,328,328)

df <- data.frame(id1, id2)

> df
   id1 id2
1  313 314
2  313 315
3  313 316
4  313 319
5  313 320
6  314 315
7  314 316
8  314 319
9  314 320
10 315 316
11 317 318
12 317 319
13 317 320
14 318 319
15 318 320
16 319 320
17 321 322
18 321 323
19 321 324
20 321 325
21 321 326
22 321 327
23 321 328
24 322 323
25 322 324
26 322 325
27 322 326
28 322 327
29 322 328
30 323 324
31 323 325
32 323 326
33 323 327
34 323 328
35 324 325
36 324 326
37 324 327
38 324 328
39 325 326
40 325 327
41 325 328
42 326 327
43 326 328
44 327 328

Solution

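  • As pointed out by @r2evans, this problem can be solved with the igraph package. Treat each row as an undirected edge between id1 and id2; each connected component of the resulting graph is then one group of interlinked IDs:

    library(igraph)

    # components() is the current name; clusters() is the older alias for it
    components(graph_from_data_frame(df, directed = FALSE))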
    
    $membership
    313 314 315 317 318 319 321 322 323 324 325 326 327 316 320 328 
      1   1   1   1   1   1   2   2   2   2   2   2   2   1   1   2 
    
    $csize
    [1] 8 8
    
    $no
    [1] 2
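
The membership vector above still has to be turned into actual groups of IDs. The following is a minimal sketch of one way to do that, not necessarily what the answerer had in mind. It assumes the igraph package is installed, and it uses a small stand-in data frame (a handful of rows taken from the question's data) so that it runs on its own; the names g, comp, groups and the group column are illustrative.

```r
library(igraph)

# Stand-in for the question's `df`: a few of its rows, enough to form two groups.
# Replace this with the full data frame.
df <- data.frame(
  id1 = c(313, 313, 314, 317, 318, 321, 321, 322),
  id2 = c(314, 319, 316, 318, 319, 322, 323, 328)
)

# Each row is an undirected edge; vertex names are the IDs as character strings.
g    <- graph_from_data_frame(df, directed = FALSE)
comp <- components(g)

# One character vector of IDs per connected component.
groups <- split(names(comp$membership), comp$membership)

# Label every row with the component of its id1.
# id2 is joined to id1 by that row's edge, so it is always in the same component.
df$group <- unname(comp$membership[as.character(df$id1)])
```

Here groups is a list with one element per chain of linked IDs, and df$group carries the same label on each original row, which is convenient for later grouping or aggregation.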