I am trying to build a list of graph edges from a two-column data frame in which each row holds one edge of a node:
pd.DataFrame({'node': ['100', '100', '200', '200', '200'],
'edge': ['111111', '222222', '123456', '456789', '987654']})
The result should look like this:
pd.DataFrame({'node': ['100', '100', '200', '200', '200', '200', '200', '200'],
'edge1': ['111111', '222222', '123456', '123456', '456789', '456789', '987654', '987654'],
'edge2': ['222222', '111111', '456789', '987654', '987654', '123456', '123456', '456789']})
I have been wrestling with pivot table and stack for a while, but with no success.
You can use itertools.permutations to get the permutations of the edges within each group after a groupby, then convert the output to a new DataFrame to generate the desired output:
import pandas as pd
from itertools import permutations
df = pd.DataFrame({'node': ['100', '100', '200', '200', '200'],'edge': ['111111', '222222', '123456', '456789', '987654']})
df = df.groupby('node')['edge'].apply(list).apply(lambda x:list(permutations(x, 2))).reset_index().explode('edge')
pd.DataFrame(df["edge"].to_list(), index=df['node'], columns=['edge1', 'edge2']).reset_index()
Result:
|   | node | edge1 | edge2 |
|---|---|---|---|
| 0 | 100 | 111111 | 222222 |
| 1 | 100 | 222222 | 111111 |
| 2 | 200 | 123456 | 456789 |
| 3 | 200 | 123456 | 987654 |
| 4 | 200 | 456789 | 123456 |
| 5 | 200 | 456789 | 987654 |
| 6 | 200 | 987654 | 123456 |
| 7 | 200 | 987654 | 456789 |
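
To see what each part of the chained expression does, it can be unpacked into separate steps. This is only a sketch of the same approach: the intermediate names (`edges_per_node`, `pairs_per_node`, `exploded`, `result`) are made up here for illustration and do not appear in the answer above. It also assumes every node has at least two edges; a node with a single edge produces no pairs, and that case is not handled.

```python
import pandas as pd
from itertools import permutations

df = pd.DataFrame({
    'node': ['100', '100', '200', '200', '200'],
    'edge': ['111111', '222222', '123456', '456789', '987654'],
})

# Step 1: collect the edges of each node into one list per node.
edges_per_node = df.groupby('node')['edge'].apply(list)

# Step 2: build every ordered pair of distinct edges within a node.
pairs_per_node = edges_per_node.apply(lambda edges: list(permutations(edges, 2)))

# Step 3: give each pair its own row; 'node' is repeated for every pair.
exploded = pairs_per_node.reset_index().explode('edge')

# Step 4: split each (edge1, edge2) tuple into two columns
# (illustrative name 'result'; assumes no node ended up with zero pairs).
result = pd.DataFrame(
    exploded['edge'].to_list(),
    index=exploded['node'],
    columns=['edge1', 'edge2'],
).reset_index()

print(result)
```

Because `permutations(edges, 2)` yields ordered pairs, both (a, b) and (b, a) appear, which matches the desired output: a node with n edges contributes n * (n - 1) rows, so 2 + 6 = 8 rows here. If only unordered pairs were wanted, `itertools.combinations` could be swapped in instead.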