python pandas group-by

Is there a pandas "ungroup by" operation, the opposite of .groupby, for the following string aggregation?


Suppose we take a pandas dataframe...

    name  age  family
0   john    1       1
1  jason   36       1
2   jane   32       1
3   jack   26       2
4  james   30       2

Then define an aggregate/summarize function (in my example, my function name_join joins the names):

def name_join(list_names, concat='-'):
    return concat.join(list_names)

Then do a groupby() and aggregate:

group_df = df.groupby('family')
# 'mean' replaces pd.np.mean; the pd.np alias is no longer available in current pandas
group_df = group_df.aggregate({'name': name_join, 'age': 'mean'})

The grouped summarized output is thus:

        age             name
family                      
1        23  john-jason-jane
2        28       jack-james

Question:

Is there a quick, efficient way to get to the following from the aggregated table?

    name  age  family
0   john   23       1
1  jason   23       1
2   jane   23       1
3   jack   28       2
4  james   28       2

(Note: the age column values are just examples; in this specific example I don't care about the information lost by averaging.)


Solution

  • The rough equivalent is .reset_index(), but it may not be helpful to think of it as the "opposite" of groupby().

    You are splitting a string into pieces while maintaining each piece's association with 'family'. This old answer of mine does the job.

    Just set 'family' as the index column first, refer to the link above, and then reset_index() at the end to get your desired result.
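
    Since the linked answer isn't reproduced here, the following is only a minimal sketch of the same idea (split the joined string, keep each piece tied to 'family', then reset_index()). It assumes a pandas version that has DataFrame.explode (0.25 or later) and is not necessarily the method used in the linked answer. The names agg and ungrouped are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "name": ["john", "jason", "jane", "jack", "james"],
    "age": [1, 36, 32, 26, 30],
    "family": [1, 1, 1, 2, 2],
})

def name_join(list_names, concat="-"):
    return concat.join(list_names)

# Aggregated table: one row per family, indexed by 'family'
agg = df.groupby("family").aggregate({"name": name_join, "age": "mean"})

# "Ungroup": split each joined string into a list, give every list element
# its own row (the 'family' index is repeated), then restore a flat index.
ungrouped = (
    agg.assign(name=agg["name"].str.split("-"))
       .explode("name")
       .reset_index()
)

# Reorder columns to match the layout asked for in the question
ungrouped = ungrouped[["name", "age", "family"]]
print(ungrouped)
```

    Note that this only works if the separator ('-' here) never appears inside a name. If the original df is still around, df.groupby('family')['age'].transform('mean') broadcasts the group mean back onto every row without the string round trip.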