Subtraction and division of columns on a pandas groupby object


I have a pandas DataFrame:

  Name  Col_1  Col_2  Col_3
0    A      3      5      5
1    B      1      6      7
2    C      3      7      4
3    D      5      8      3

I need to create a Series object with the values of (Col_1-Col_2)/Col_3 using groupby, so basically this:

Name
A   (3-5)/5
B   (1-6)/7
C   (3-7)/4
D   (5-8)/3

Repeated names are a possibility, hence the groupby. For example:

  Name  Col_1  Col_2  Col_3
0    A      3      5      5
1    B      1      6      7
2    B      3      6      7

The expected result:

Name
A   (3-5)/5
B   ((1+3)-6)/7

I created a groupby object:

df.groupby('Name')

but it seems like no groupby method fits the bill for what I'm trying to do. How can I tackle this matter?


Solution

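  • Sum Col_1 within each group, take the first Col_2 and Col_3 value per group (they repeat within a name in your example), then combine the results: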

    g = df.groupby('Name')
    
    out = (g['Col_1'].sum()-g['Col_2'].first()).div(g['Col_3'].first())
    

    Or, with apply (usually slower on large frames, since the lambda runs once per group):

    (df.groupby('Name')
       .apply(lambda g: (g['Col_1'].sum()-g['Col_2'].iloc[0])/g['Col_3'].iloc[0])
    )
    

    Output:

    Name
    A   -0.400000
    B   -0.285714
    dtype: float64
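
For reference, here is a self-contained sketch that rebuilds the second example frame from the question and computes the same result with a single named aggregation. It assumes, as in the question's data, that Col_2 and Col_3 are constant within each name, so taking the first value per group is safe. The intermediate column names (col_1_sum, col_2_first, col_3_first) are made up for illustration.

```python
import pandas as pd

# Sample data from the question, with the repeated name "B".
df = pd.DataFrame({
    "Name": ["A", "B", "B"],
    "Col_1": [3, 1, 3],
    "Col_2": [5, 6, 6],
    "Col_3": [5, 7, 7],
})

# One pass over the groups: sum Col_1, keep the first Col_2 and Col_3.
# Assumption: Col_2 and Col_3 do not vary within a name.
agg = df.groupby("Name").agg(
    col_1_sum=("Col_1", "sum"),
    col_2_first=("Col_2", "first"),
    col_3_first=("Col_3", "first"),
)

# (sum of Col_1 - Col_2) / Col_3, one value per name.
out = (agg["col_1_sum"] - agg["col_2_first"]) / agg["col_3_first"]
print(out)
```

If Col_2 or Col_3 can differ between rows of the same name, "first" silently picks one of them, so you would need to decide on another aggregation (for example "sum" or "mean") for those columns.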