Search code examples
pythonpandaspandas-groupbyresetcumsum

How to reset a groupby cumsum when a condition in another column is met?


I'm trying to reset cumsum to zero everytime a condition in another column is met. Here's how my dataframe looks like:

ticker    cashflow op    shares profit
CPFE3.SA  -220.06  Buy    8          0
CPFE3.SA   249.93  Sell   0      29.87
CPFE3.SA   -90.03  Buy    3          0  
CPFE3.SA  -575.87  Buy    22         0
CPFE3.SA   -92.12  Buy    25         0
CPFE3.SA   763.52  Sell   0      35.37
DIS       -921.46  Buy    1          0

I'm using the following to create the profit column:

df['profit'] = df.groupby(df.ticker)['cashflow'].cumsum().where(df.op=='Sell',0)

Shares is the number of shares I own after the buying/selling. Profit is the one that I want to make something a little bit different. How do I make the cumsum reset everytime it encounters a 0 in the shares column? I tried some stuff but I suck at this.

The result I'm looking for would look like this:

ticker    cashflow op    shares profit
CPFE3.SA  -220.06  Buy    8          0
CPFE3.SA   249.93  Sell   0      29.87
CPFE3.SA   -90.03  Buy    3          0  
CPFE3.SA  -575.87  Buy    22         0
CPFE3.SA   -92.12  Buy    25         0
CPFE3.SA   763.52  Sell   0        5.5
DIS       -921.46  Buy    1          0

Thanks in advance and please let me know if I can provide more info.


Solution

  • you can detect the block with cumsum:

    #print to see what it does
    blocks = df['shares'].eq(0)[::-1].cumsum()[::-1]
    
    df['profit'] =(df.groupby(['ticker', blocks])
       ['cashflow'].cumsum()
       .where(df['shares'].eq(0),0)
    )