Search code examples
pandasaggregation

Multi index series into pandas dataframe


I have a pandas series with two indices:

df_agg=df.groupby(['yearID','teamID']).sum()['Salary']
df_agg.head()

yearID  teamID
1985    ATL       14807000
        BAL       11560712
        BOS       10897560
        CAL       14427894
        CHA        9846178

I want to convert int into a pandas data frame like

yearID     teamID    Salary
1985        ATL       14807000
1985        BAL       11560712
1985        BOS       10897560
1985        CAL       14427894
1985        CHA        9846178

I tired to use:

df_new=df_agg.reset_index(inplace=True)

but I got the following error:


TypeError Traceback (most recent call last) in () ----> 1 df_new=df_agg.reset_index(inplace=True)

C:\Users\ameimand\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\series.py in reset_index(self, level, drop, name, inplace) 966

    index=new_index).__finalize__(self)
    967         elif inplace:
--> 968             raise TypeError('Cannot reset_index inplace on a Series '
    969                             'to create a DataFrame')
    970         else:

TypeError: Cannot reset_index inplace on a Series to create a DataFrame

Solution

  • I think there are 2 nice solutions parameter as_index=False:

    df_new = df.groupby(['yearID','teamID'], as_index=False)['Salary'].sum()
    

    Or reset_index without inplace=True:

    df_new = df.groupby(['yearID','teamID'])['Salary'].sum().reset_index()
    

    Notice:

    Better is specify column for aggregation after groupby in [] like ['Salary']:

    df.groupby(['yearID','teamID'], as_index=False)['Salary']
    

    as:

    df.groupby(['yearID','teamID']).sum()['Salary']
    

    because this aggregate all columns and then select only Salary.