I have a pandas series that looks like this:
import numpy as np
import string
import pandas as pd
np.random.seed(0)
data = np.random.randint(1,6,10)
index = list(string.ascii_lowercase)[:10]
a = pd.Series(data=data,index=index,name='apple')
a
>>>
a 5
b 1
c 4
d 4
e 4
f 2
g 4
h 3
i 5
j 1
Name: apple, dtype: int32
I want to group the series by its values and return a dict mapping each value to the list of index labels where it occurs, i.e. this result:
{1: ['b', 'j'], 2: ['f'], 3: ['h'], 4: ['c', 'd', 'e', 'g'], 5: ['a', 'i']}
Here is how I achieve that at the moment:
b = a.reset_index().set_index('apple').squeeze()
grouped = b.groupby(level=0).apply(list).to_dict()
grouped
>>>
{1: ['b', 'j'], 2: ['f'], 3: ['h'], 4: ['c', 'd', 'e', 'g'], 5: ['a', 'i']}
However, it does not feel particularly Pythonic to explicitly transform the series first just to get to the result. Is there a way to do this directly, ideally by applying a single function, or otherwise with a combination of functions in one line?
Thanks!
You can use groupby directly on the series' own values and apply a lambda that collects each group's index labels, which gives the desired result in one line:
grouped = a.groupby(a.values).apply(lambda x: list(x.index)).to_dict()
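Alternatively, you could wrap the result in dict() using either of the following. Note that get_level_values returns a pandas Index, so the first variant's values should be Index objects rather than plain lists; the second gives lists: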
grouped = dict(a.groupby(a.values).apply(lambda x: x.index.get_level_values(0)))
grouped = dict(a.groupby(a.values).apply(lambda x: x.index.tolist()))
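For reference, here is a minimal self-contained sketch of the one-liner above. It hard-codes the values printed in the question instead of relying on the random seed (that substitution is my assumption, to keep the example reproducible). It also shows a variant built on the `.groups` attribute of the groupby object, which is not part of the original answer; I'm assuming its values are Index objects that can be converted with `list()`.

```python
import pandas as pd

# Same values and labels as the series printed in the question,
# written out explicitly so the example does not depend on np.random.
a = pd.Series(
    [5, 1, 4, 4, 4, 2, 4, 3, 5, 1],
    index=list("abcdefghij"),
    name="apple",
)

# One-liner from the answer: group by the series' own values and
# collect the index labels of each group into a list.
grouped = a.groupby(a.values).apply(lambda x: list(x.index)).to_dict()

# Variant without apply: .groups maps each group key to the index
# labels in that group, so we only need to convert them to lists.
grouped_alt = {key: list(labels) for key, labels in a.groupby(a.values).groups.items()}

print(grouped)
print(grouped_alt)
```

Both dicts should match the result shown in the question. The `.groups` variant skips the per-group Python function call, which may matter on larger series, though I have not benchmarked it.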