pandas python-3.8

Group by id a column that contains lists


Suppose you have the following data frame:

>>> import pandas as pd
>>> df = pd.DataFrame({'id': [0, 0, 1], 'lst': [["a", "b", "c"], ["e"], ["c"]]})
>>> df
   id        lst
0   0  [a, b, c]
1   0        [e]
2   1        [c]

And you wish to group the data by id as follows:

   id               lst
0   0  [[a, b, c], [e]]
2   1             [[c]]

Is there a way to group a column of lists so that I get the result shown above?

I tried the solution posted in the question "Group operations on Pandas column containing lists", but it didn't work for me.


Solution

  • Here is one way to do it:

    # group by id, then apply list to the lst values in each group
    
    df.groupby('id')['lst'].apply(list).reset_index()
    
    
       id               lst
    0   0  [[a, b, c], [e]]
    1   1             [[c]]
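For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach, using the data frame from the question. The variable names (`nested`, `nested_agg`, `flat`) are mine, not from the original answer. The `flat` variant is an assumption about what you might want instead: it is only useful if you need one flat list per id rather than the list of lists asked for above.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({'id': [0, 0, 1], 'lst': [["a", "b", "c"], ["e"], ["c"]]})

# Nested result (what the question asks for): each id gets a list
# whose elements are the original lists from that group.
nested = df.groupby('id')['lst'].apply(list).reset_index()

# The same result spelled with agg instead of apply.
nested_agg = df.groupby('id')['lst'].agg(list).reset_index()

# Hypothetical alternative: concatenate the inner lists so each id
# gets a single flat list.
flat = (
    df.groupby('id')['lst']
      .apply(lambda s: list(chain.from_iterable(s)))
      .reset_index()
)

print(nested)
print(flat)
```

Note that `reset_index()` numbers the rows 0, 1, so the second row will not keep the index 2 shown in the desired output of the question.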