Search code examples
pythonpandassumcalculated-columnssubtotal

How to calculate subtotals, line by line of a Data Frame in Python?


I have the following DF:

import pandas as pd
df = pd.DataFrame([['2020/01/01','A',5],['2020/01/10','B',-2],['2020/01/15','C',3],['2020/01/18','B',9]],columns=['datetime','broker','amount'])  

I need to calculate subtotals row by row of the "amount" column in new column... I already tried the code below:

df['Subtotal amount'] = df.sum(axis=1, numeric_only=True)

Which returns the result below:

     datetime broker  amount  Subtotal amount
0  2020/01/01      A       5                5
1  2020/01/10      B      -2               -2
2  2020/01/15      C       3                3
3  2020/01/18      B       9                9

But I would like the result to be calculated line by line, adding and subtracting the values ​​from the previous lines:

     datetime broker  amount  Subtotal amount
0  2020/01/01      A       5                5
1  2020/01/10      B      -2                3
2  2020/01/15      C       3                6
3  2020/01/18      B       9               15

Solution

  • As Andrej Kesely has put in the comments, you are tring to do a cummulative summation. cumsum is the numpy function you are after. This should do the trick.

    df['Subtotal amount'] = df.amount.cumsum()
    

    Credit to Andrej, I'm just moving this to an answer