Search code examples
pandascumsum

Restart cumsum in Pandas with condition


I have columns amount & assets. Column target should be the cumsum of amount, but the sum should be reset to the current amount if the previous assets was equal to zero.

Sample:

+--------+--------+--------+
| amount | assets | target |
+--------+--------+--------+
|      6 |     10 |      6 |
|      8 |     20 |     14 |
|     -1 |      0 |     13 |
|      6 |      1 |      6 |
|     -7 |      0 |     -1 |
|      2 |      4 |      2 |
|     -5 |      7 |     -3 |
|      3 |      9 |      0 |
|      7 |      0 |      7 |
|      9 |      2 |      9 |
|      1 |      3 |     10 |
|     -4 |      5 |      6 |
+--------+--------+--------+

Solution

  • Use GroupBy.cumsum with groups created by compare column by 0 with shifting Series.shift, processing first NaN and Series.cumsum:

    g = df['assets'].eq(0).shift().bfill().cumsum()
    #alternative
    #g = df['assets'].eq(0).shift(fill_value=0).cumsum()
    df['new'] = df.groupby(g)['amount'].cumsum()
    
    print (df)
        amount  assets  target  new
    0        6      10       6    6
    1        8      20      14   14
    2       -1       0      13   13
    3        6       1       6    6
    4       -7       0      -1   -1
    5        2       4       2    2
    6       -5       7      -3   -3
    7        3       9       0    0
    8        7       0       7    7
    9        9       2       9    9
    10       1       3      10   10
    11      -4       5       6    6