Search code examples
pythonpandascumsummultiple-conditions

Python Pandas Cumsum with reset everytime on multiple condition


I have a dataset like this :

enter image description here

I want to make a two columns of cumulative sum of reference column. But I want to make it reset with two separate condition.

For example, when I meet conditon 1, I want to reset just one column considering condition 1, but not condition 2. In that case, the other column considering only condion 2 should continue without reset.

reference_cumsum=[1,2,3,None,2,4,6,None,1,2,3]
condition=[None,None,None,'condtion1',None,None,None,'conditon2',None,None,None]
pd.DataFrame({'ref_cumsum':reference_cumsum,'event_condition':condition})

I have try a previous solution from this link https://stackoverflow.com/questions/45964740/python-pandas-cumsum-with-reset-everytime-there-is-a-0

But I don't know how to do it with multiple condition.

I have to obtain a dataframe like this :

enter image description here


Solution

  • You can use a loop with a changing groupby.cumsum on all unique conditions:

    for cond in df['event_condition'].dropna().unique():
        df[f'cumsum_{cond}'] = df.groupby(df['event_condition'].eq(cond).cumsum())['ref_cumsum'].cumsum()
    

    Note that there are two cumsum in the code. The first one is to set up the grouper, the second one is the actual cumulated sum of the values.

    Output:

        ref_cumsum event_condition  cumsum_condtion1  cumsum_conditon2
    0          1.0            None               1.0               1.0
    1          2.0            None               3.0               3.0
    2          3.0            None               6.0               6.0
    3          NaN       condtion1               NaN               NaN
    4          2.0            None               2.0               8.0
    5          4.0            None               6.0              12.0
    6          6.0            None              12.0              18.0
    7          NaN       conditon2               NaN               NaN
    8          1.0            None              13.0               1.0
    9          2.0            None              15.0               3.0
    10         3.0            None              18.0               6.0