python pandas dataframe grouping

Group by a column and get a list of dicts of the rows in each group


Given a DataFrame like this:

   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6

I want to group by col1 and get a list of dicts built from the other two columns, like this:

   col1  rows_group
0     1  [{"col2": 3, "col3": 5}, {"col2": 3, "col3": 6}]
1     2  [{"col2": 4, "col3": 5}, {"col2": 4, "col3": 6}]

How can I achieve this?


Solution

  • I finally found a way to do this with groupby.apply and to_dict('records'), which turns each group's rows into a list of dicts:

    # For each col1 group, keep only col2/col3 and convert its rows to a list of dicts.
    df.groupby('col1').apply(
        lambda x: x[['col2', 'col3']].to_dict('records')
    ).reset_index(name='rows_group')
    

    Alternatively, move col1 into the index first so it is left out of each record:

    (df.set_index('col1').groupby(level=0)
       .apply(lambda g: g.to_dict('records'))
       .reset_index(name='rows_group')
    )
    

    Output:

       col1                                        rows_group
    0     1  [{'col2': 3, 'col3': 5}, {'col2': 3, 'col3': 6}]
    1     2  [{'col2': 4, 'col3': 5}, {'col2': 4, 'col3': 6}]
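
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question and uses a slight variant of the first snippet: the value columns are selected on the groupby object itself rather than inside the lambda. As far as I know, recent pandas 2.x releases warn when apply operates on the grouping column, and selecting the columns up front should sidestep that. The names result and result_loop are only illustrative.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 2],
    "col2": [3, 3, 4, 4],
    "col3": [5, 6, 5, 6],
})

# Variant of the first snippet: pick col2/col3 on the groupby object so each
# group passed to the lambda contains only the value columns.
result = (
    df.groupby("col1")[["col2", "col3"]]
      .apply(lambda g: g.to_dict("records"))
      .reset_index(name="rows_group")
)

# Same result without apply: iterate over the groups and build the rows by hand.
result_loop = pd.DataFrame(
    [
        {"col1": key, "rows_group": group[["col2", "col3"]].to_dict("records")}
        for key, group in df.groupby("col1")
    ]
)

print(result)
```

The loop version is more verbose, but it makes explicit what each row of the result holds: the group key plus the list of records for that group.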