
pandas restart cumsum every time the value is zero


So I have a series that I want to cumsum, but starting over every time I hit a 0, something like this:

    orig  wanted result
0      0              0
1      1              1
2      1              2
3      1              3
4      1              4
5      1              5
6      1              6
7      0              0
8      1              1
9      1              2
10     1              3
11     0              0
12     1              1
13     1              2
14     1              3
15     1              4
16     1              5
17     1              6

Any ideas? (pandas, pure Python, or other)


Solution

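  • Use df['orig'].eq(0).cumsum() to label groups, with a new group starting at each 0, then use cumcount to number the rows within each group. Because every non-zero value here is 1 and each group begins with a 0, this count equals the restarted cumulative sum: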

    df['result'] = df.groupby(df['orig'].eq(0).cumsum()).cumcount()
    

    output:

        orig  wanted result  result
    0      0              0       0
    1      1              1       1
    2      1              2       2
    3      1              3       3
    4      1              4       4
    5      1              5       5
    6      1              6       6
    7      0              0       0
    8      1              1       1
    9      1              2       2
    10     1              3       3
    11     0              0       0
    12     1              1       1
    13     1              2       2
    14     1              3       3
    15     1              4       4
    16     1              5       5
    17     1              6       6
    

    Intermediate:

    df['orig'].eq(0).cumsum()
    
    0     1
    1     1
    2     1
    3     1
    4     1
    5     1
    6     1
    7     2
    8     2
    9     2
    10    2
    11    3
    12    3
    13    3
    14    3
    15    3
    16    3
    17    3
    Name: orig, dtype: int64
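
For reference, here is a self-contained sketch that rebuilds the sample data and runs the approach above. It also shows a variant that sums the actual values within each group, which may be the safer choice if the non-zero values are not all 1 or the series does not start with a 0. The column name result_sum and the second series s are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Sample data from the question: runs of 1s separated by 0s.
df = pd.DataFrame(
    {"orig": [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]}
)

# Grouping key: increases by 1 at every 0, so each 0 opens a new group.
groups = df["orig"].eq(0).cumsum()

# Approach from the answer: 0-based row position within each group.
df["result"] = df.groupby(groups).cumcount()

# Variant (illustrative column name): cumulative sum of the actual values
# within each group, so it does not rely on the values being 1.
df["result_sum"] = df["orig"].groupby(groups).cumsum()

# Hypothetical series with non-zero values other than 1.
s = pd.Series([0, 2, 3, 0, 5, 1])
restarted = s.groupby(s.eq(0).cumsum()).cumsum()

print(df)
print(restarted.tolist())
```

On the question's data both columns agree; on s, the grouped cumsum gives [0, 2, 5, 0, 5, 6], whereas cumcount would only count rows.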