Search code examples
pythonpandasgroup-by

Percentage of groupby in Pandas


I have a dataframe that has columns 'team','home_or_away','result' to store the results ('W': win or 'L': loose) for teams 'X','Y','Z' in sporting events at home ('H') or away ('A'):

df = pd.DataFrame({'team': ['X', 'X', 'X', 'X', 'Y', 'Y', 'Z', 'Z', 'Z', 'Z'],'home_or_away':['H', 'A', 'A', 'A', 'H', 'H', 'H', 'A', 'H', 'H'],'result':['W', 'W', 'W', 'L', 'W', 'L', 'W', 'L', 'L', 'L']})

I would like to generate the percentage of wins/losses per team, per event location ('A' or 'H')

I have generated a dataframe with total counts of wins/losses per team and event location, with the following groupby code:

groupedDf =df.groupby(['team', 'home_or_away','result'])[['result']].count()
print(groupedDf)

with the following output:

                          result
team home_or_away result        
X    A            L            1
                  W            2
     H            W            1
Y    H            L            1
                  W            1
Z    A            L            1
     H            L            2
                  W            1

However, I would like to have an extra column with the percentage, like so:

                          result Perc
team home_or_away result        
X    A            L            1 33.33
                  W            2 66.66
     H            W            1 100
Y    H            L            1 50
                  W            1 50
Z    A            L            1 100
     H            L            2 66.66
                  W            1 33.33

How can I do this with pandas? Thanks


Solution

  • You can use groupby.transform('sum') to get the total per team/location:

    groupedDf =df.groupby(['team', 'home_or_away','result'])[['result']].count()
    
    total = (groupedDf.groupby(level=['team', 'home_or_away'])
             ['result'].transform('sum')
             )
    
    groupedDf['Perc'] = groupedDf['result'] / total * 100
    

    Output:

                              result        Perc
    team home_or_away result                    
    X    A            L            1   33.333333
                      W            2   66.666667
         H            W            1  100.000000
    Y    H            L            1   50.000000
                      W            1   50.000000
    Z    A            L            1  100.000000
         H            L            2   66.666667
                      W            1   33.333333