Search code examples
rcombinationsparentlong-integersiblings

Combinations in long format by group


So I have a dataset of parents and their children of the following form

Children_id   Parent_id
10            1
11            1
12            1
13            2
14            2

What I want is a dataset of each child's siblings in long format, i.e.,

id   sibling_id
10   11
10   12
11   10
11   12
12   10
12   11
13   14
14   13

What's the best way to achieve this, preferably using datatable?

Example data:

df <- data.frame("Children_id" = c(10,11,12,13,14), "Parent_id" = c(1,1,1,2,2))


Solution

  • The graph experts out there will probably have better solutions, but here is a data.table solution:

    library(data.table)
    
    setDT(df)[df,on=.(Parent_id), allow.cartesian=T] %>% 
      .[Children_id!=i.Children_id, .(id = i.Children_id, sibling=Children_id)]
    

    Output:

          id sibling
       <num>   <num>
    1:    10      11
    2:    10      12
    3:    11      10
    4:    11      12
    5:    12      10
    6:    12      11
    7:    13      14
    8:    14      13