Search code examples
pythonpandasdataframepandas-groupbymulti-index

Pandas: how to sum by groupby value


Using this:

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
         'Kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
         'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
         'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)

df.groupby(['Team',"Rank"]).sum()

This is returned.

             Points
Team   Rank        
Devils 2        863
       3        673
Kings  1       1544
       3        741
       4        812
Riders 1        876
       2       2173
Royals 1        804
       4        701

How you I extract values (Points) where rank equals '1', so 1544+ 876+ 804. and the same for rank equals 2, and 3.


Solution

  • I believe need DataFrame.xs:

    print (df.xs(1, level=1))
    
            Points
    Team          
    Kings     1544
    Riders     876
    Royals     804
    
    print (df.xs(2, level=1))
    
            Points
    Team          
    Devils     863
    Riders    2173
    

    For select by multiple criteria use slicers:

    idx = pd.IndexSlice
    print (df.loc[idx[:, [1,2]], :])
    
                 Points
    Team   Rank        
    Devils 2        863
    Kings  1       1544
    Riders 1        876
           2       2173
    Royals 1        804
    

    print (df.loc[idx['Riders', [1,2]], :])
    
                 Points
    Team   Rank        
    Riders 1        876
           2       2173
    

    If want sum all groups by Ranks change grouping columns from ['Team',"Rank"] to Rank:

    s = df.groupby("Rank")['Points'].sum()
    print (s)
    Rank
    1    3224
    2    3036
    3    1414
    4    1513
    Name: Points, dtype: int64
    

    If need also df1 then use sum per level=1:

    df1 = df.groupby(['Team',"Rank"]).sum()
    print (df1)
                 Points
    Team   Rank        
    Devils 2        863
           3        673
    Kings  1       1544
           3        741
           4        812
    Riders 1        876
           2       2173
    Royals 1        804
           4        701
    
    s1 = df1.sum(level=1)
    print (s1)
          Points
    Rank        
    2       3036
    3       1414
    1       3224
    4       1513