Search code examples
pythonpandas

How to subtract pd.DataFrameGroupBy-objects from each other


I have the following pd.DataFrame

match_id    player_id   round   points  A   B   C   D   E
5890    3750    1   10  0   0   0   3   1
5890    3750    2   10  0   0   0   1   0
5890    3750    3   10  0   8   0   0   1
5890    2366    1   9   0   0   0   5   0
5890    2366    2   9   0   0   0   5   0
5890    2366    3   9   0   0   0   2   0

I want to subtract the values of A, B, C, D and E of the two players and create two new columns that represent the number of points of the two players.

My desired output looks as follows:

match_id    round   points_home points_away A   B   C   D   E
5890    1   10  9   0   0   0   -2  1
5890    2   10  9   0   0   0   -4  0
5890    3   10  9   0   8   0   -2  1

Please advice


Solution

  • Use GroupBy.first with GroupBy.last first, subtract necessary columns with add points columns in concat:

    g = df.groupby(['match_id','round'])
    df1 = g.first()
    df2 = g.last()
    
    cols = ['A','B','C','D','E']
    out = pd.concat([df1['points'].rename('points_home'),
                    df2['points'].rename('points_away'), 
                    df1[cols].sub(df2[cols])], axis=1).reset_index()
    print (out)
       match_id  round  points_home  points_away  A  B  C  D  E
    0      5890      1           10            9  0  0  0 -2  1
    1      5890      2           10            9  0  0  0 -4  0
    2      5890      3           10            9  0  8  0 -2  1
    

    Alternative with MultiIndex with GroupBy.agg:

    df3 = (df.groupby(['match_id','round'])
             .agg(['first','last'])
             .rename(columns={'first':'home', 'last':'away'}))
    
    cols = ['A','B','C','D','E']
    
    out = pd.concat([df3['points'].add_prefix('points_'), 
                     df3.xs('home', axis=1, level=1)[cols]
                        .sub(df3.xs('away', axis=1, level=1)[cols])], axis=1).reset_index()
    print (out)
       match_id  round  points_home  points_away  A  B  C  D  E
    0      5890      1           10            9  0  0  0 -2  1
    1      5890      2           10            9  0  0  0 -4  0
    2      5890      3           10            9  0  8  0 -2  1