Search code examples
pandasdataframedictionarygroup-bypython-3.6

Pandas dataframe to dict with unique values from column as keys and nested lists as values


I'm trying to convert a dataframe to a dict, with unique values from a column (col 3) as keys.

From this:

  Col1   Col2   Col3
0  a       b      x
1  c       d      x
2  e       f      y
3  g       h      y

To this:

{x:[[a,b][c,d]],y:[[e,f],[g,h]]}

With the following code i get tuples, which really doesn't do the trick for me.

new_dict = df.groupby('col3').apply(lambda x: list(zip(x['col1'],x['col2']))).to_dict()

Output:

{x:[(a,b),(c,d)],y:[(e,f),(g,h)]}

Solution

  • Use map to list or list comprehension:

    new_dict = (df.groupby('col3')
                  .apply(lambda x: list(map(list, zip(x['col1'],x['col2']))))
                  .to_dict())
    print (new_dict)
    {'x': [['a', 'b'], ['c', 'd']], 'y': [['e', 'f'], ['g', 'h']]}
    

    new_dict = (df.groupby('col3')
                  .apply(lambda x: [list(y) for y in zip(x['col1'],x['col2'])])
                  .to_dict())
    

    Another solution is convert each group to 2d array and convert to list:

    new_dict = df.groupby('col3')['col1','col2'].apply(lambda x: x.values.tolist()).to_dict()