Tags: python, pandas, loops, cumsum

Python: Sum until 0 is reached and then restart


I am looking for a solution in Python to the problem discussed here: R: Sum until 0 is reached and then restart

I have a similar dataframe, and I want a cumulative sum of the values in the Appliance column while the value in the Run column is 0, with the count reset to zero when the value in the Run column changes from 0 to 1.

        Home   Date     Time   Appliance Run   
          2  1/21/2017  1:30:00      100    1 
          2  1/21/2017  1:45:00      207    1 
          2  1/21/2017  2:00:00      310    1       
          2  1/21/2017  2:15:00      450    1       
          2  1/21/2017  2:30:00      804    0      
          2  1/21/2017  2:45:00      556    0     
          2  1/21/2017  3:00:00      844    0    
          2  1/21/2017  3:15:00      396    0    
          2  1/21/2017  3:30:00      392    0    
          2  1/21/2017  3:45:00      1220   0    
          2  1/21/2017  4:00:00      455    1      
          2  1/21/2017  4:15:00      550    1       
          2  1/21/2017  4:30:00      600    1       
          2  1/21/2017  4:45:00      809    1       
          2  1/22/2017  3:30:00      609    1       
          2  1/22/2017  3:45:00      244    0    
          2  1/22/2017  4:00:00     1068    0    
          2  1/22/2017  4:15:00      44     0    
          2  1/22/2017  4:30:00     1240    0    
          2  1/22/2017  4:45:00      40     0    
          2  1/22/2017  5:00:00     1608    0    
          2  1/22/2017  5:15:00       0     1       
          2  1/22/2017  5:30:00       0     1       

I have tried the code below, but it just adds the previous row's value and does not do a cumulative sum.

newcum = []
lastvalue = 0

for i, row in df.iterrows():
    if df['Run'][i+1] == 0:
        lastvalue += df['Appliance'][i]
    else:
        lastvalue = 0
    newcum.append(lastvalue)
df['value'] = newcum

My desired output is shown below in the value column:

        Home   Date     Time   Appliance Run   value
          2  1/21/2017  1:30:00      100    1       0
          2  1/21/2017  1:45:00      207    1       0
          2  1/21/2017  2:00:00      310    1       0
          2  1/21/2017  2:15:00      450    1       0
          2  1/21/2017  2:30:00      804    0     804
          2  1/21/2017  2:45:00      556    0    1360
          2  1/21/2017  3:00:00      844    0    2204
          2  1/21/2017  3:15:00      396    0    2600
          2  1/21/2017  3:30:00      392    0    2992
          2  1/21/2017  3:45:00      1220   0    4212
          2  1/21/2017  4:00:00      455    1       0
          2  1/21/2017  4:15:00      550    1       0
          2  1/21/2017  4:30:00      600    1       0
          2  1/21/2017  4:45:00      809    1       0
          2  1/22/2017  3:30:00      609    1       0
          2  1/22/2017  3:45:00      244    0    244
          2  1/22/2017  4:00:00     1068    0    1312
          2  1/22/2017  4:15:00      44     0    1356
          2  1/22/2017  4:30:00     1240    0    2596
          2  1/22/2017  4:45:00      40     0    2636
          2  1/22/2017  5:00:00     1608    0    4244
          2  1/22/2017  5:15:00       0     1       0
          2  1/22/2017  5:30:00       0     1       0

Could somebody help me with this?


Solution

  • We can filter for the rows where Run is 0 and build the groupby key with cumsum on Run. Notice that while Run is 0 the cumulative sum of Run does not increase, so each block of zeros shares a single key and the blocks are split by the 1s. Within each block we then take the cumulative sum of Appliance:

    # Filter the rows where Run is 0, group them by the running total of Run
    # (constant within each block of zeros), and take the cumulative sum of Appliance
    df['new'] = df.loc[df['Run'].eq(0)].groupby(df['Run'].cumsum())['Appliance'].cumsum()
    # Rows with Run == 1 were excluded above, so they come back as NaN; fill them with 0
    df['new'].fillna(0, inplace=True)
    df
    Out[78]: 
        Home       Date     Time  Appliance  Run     new
    0      2  1/21/2017  1:30:00        100    1     0.0
    1      2  1/21/2017  1:45:00        207    1     0.0
    2      2  1/21/2017  2:00:00        310    1     0.0
    3      2  1/21/2017  2:15:00        450    1     0.0
    4      2  1/21/2017  2:30:00        804    0   804.0
    5      2  1/21/2017  2:45:00        556    0  1360.0
    6      2  1/21/2017  3:00:00        844    0  2204.0
    7      2  1/21/2017  3:15:00        396    0  2600.0
    8      2  1/21/2017  3:30:00        392    0  2992.0
    9      2  1/21/2017  3:45:00       1220    0  4212.0
    10     2  1/21/2017  4:00:00        455    1     0.0
    11     2  1/21/2017  4:15:00        550    1     0.0
    12     2  1/21/2017  4:30:00        600    1     0.0
    13     2  1/21/2017  4:45:00        809    1     0.0
    14     2  1/22/2017  3:30:00        609    1     0.0
    15     2  1/22/2017  3:45:00        244    0   244.0
    16     2  1/22/2017  4:00:00       1068    0  1312.0
    17     2  1/22/2017  4:15:00         44    0  1356.0
    18     2  1/22/2017  4:30:00       1240    0  2596.0
    19     2  1/22/2017  4:45:00         40    0  2636.0
    20     2  1/22/2017  5:00:00       1608    0  4244.0
    21     2  1/22/2017  5:15:00          0    1     0.0
    22     2  1/22/2017  5:30:00          0    1     0.0
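
  • For comparison, the loop from the question can be corrected to produce the same result. This is a minimal sketch, assuming the same df with a 0/1 Run column and a numeric Appliance column; it reads Run and Appliance from the current row rather than the next one, and since no NaN values are introduced the resulting column stays integer and matches the desired output exactly:

    # Loop-based equivalent of the groupby/cumsum approach (a sketch, assuming
    # the same df with 0/1 'Run' and numeric 'Appliance' columns)
    newcum = []
    lastvalue = 0

    for _, row in df.iterrows():
        if row['Run'] == 0:                # accumulate while Run is 0
            lastvalue += row['Appliance']
        else:                              # reset when Run switches back to 1
            lastvalue = 0
        newcum.append(lastvalue)

    df['value'] = newcum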