So I have a dataset of parents and their children of the following form
Children_id Parent_id
10 1
11 1
12 1
13 2
14 2
What I want is a dataset of each child's siblings in long format, i.e.,
id sibling_id
10 11
10 12
11 10
11 12
12 10
12 11
13 14
14 13
What's the best way to achieve this, preferably using datatable?
Example data:
df <- data.frame("Children_id" = c(10,11,12,13,14), "Parent_id" = c(1,1,1,2,2))
The graph experts out there will probably have better solutions, but here is a data.table
solution:
library(data.table)
setDT(df)[df,on=.(Parent_id), allow.cartesian=T] %>%
.[Children_id!=i.Children_id, .(id = i.Children_id, sibling=Children_id)]
Output:
id sibling
<num> <num>
1: 10 11
2: 10 12
3: 11 10
4: 11 12
5: 12 10
6: 12 11
7: 13 14
8: 14 13