pandas dataframe to frozenset based on conditions


I have a dataset like:

 node    community
  1         2
  2         4
  3         5
  4         2
  5         3
  7         1
  8         3
  10        4
  12        5

I want to build frozensets from the node column so that nodes sharing the same community end up in the same frozenset. The expected result is something like:

 [frozenset([1, 4]), frozenset([2, 10]), frozenset([3, 12]), frozenset([5, 8]), frozenset([7])]

Is there a way to do this without converting the dataframe to a list of lists? Thanks.


Solution

  • Using GroupBy + apply with frozenset:

    res = df.groupby('community')['node'].apply(frozenset).values.tolist()

    print(res)

    # Output (one frozenset per community, in sorted community order):
    # [frozenset({7}), frozenset({1, 4}), frozenset({8, 5}),
    #  frozenset({2, 10}), frozenset({3, 12})]
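
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same GroupBy approach. The dataframe is rebuilt from the table in the question. The `by_community` dictionary is only an illustrative extra (it is not part of the original answer), for the case where you also want to know which community each frozenset belongs to.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "node": [1, 2, 3, 4, 5, 7, 8, 10, 12],
    "community": [2, 4, 5, 2, 3, 1, 3, 4, 5],
})

# Group the node column by community and collapse each group into a frozenset.
# groupby sorts the group keys by default, so the result follows community order.
groups = df.groupby("community")["node"].apply(frozenset)

# Plain list of frozensets, one per community.
res = groups.tolist()

# Illustrative extra: community label -> frozenset of its nodes.
by_community = groups.to_dict()

print(res)
print(by_community[2])
```

If the order of first appearance in the dataframe matters more than sorted community labels, passing `sort=False` to `groupby` should keep that order instead.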