Looking for a solution in Python as discussed here R: Sum until 0 is reached and then restart
I have a similar dataframe and I want cummulative sum of values in the APpliance column when the values in Run column is 0 and reset the count to zero when the value changes from 0 to 1 in Run column
Home Date Time Appliance Run
2 1/21/2017 1:30:00 100 1
2 1/21/2017 1:45:00 207 1
2 1/21/2017 2:00:00 310 1
2 1/21/2017 2:15:00 450 1
2 1/21/2017 2:30:00 804 0
2 1/21/2017 2:45:00 556 0
2 1/21/2017 3:00:00 844 0
2 1/21/2017 3:15:00 396 0
2 1/21/2017 3:30:00 392 0
2 1/21/2017 3:45:00 1220 0
2 1/21/2017 4:00:00 455 1
2 1/21/2017 4:15:00 550 1
2 1/21/2017 4:30:00 600 1
2 1/21/2017 4:45:00 809 1
2 1/22/2017 3:30:00 609 1
2 1/22/2017 3:45:00 244 0
2 1/22/2017 4:00:00 1068 0
2 1/22/2017 4:15:00 44 0
2 1/22/2017 4:30:00 1240 0
2 1/22/2017 4:45:00 40 0
2 1/22/2017 5:00:00 1608 0
2 1/22/2017 5:15:00 0 1
2 1/22/2017 5:30:00 0 1
I have tried below code. but it just adds up previous row value and does not do a cummulative sum
newcum = []
lastvalue = 0
for i, row in df.iterrows():
if df['Run'][i+1] == 0 :
lastvalue += x['Appliance'][i]
else:
lastvalue = 0
newcum.append(lastvalue)
df['value'] = newcum
My desired output is as follows in the value column
Home Date Time Appliance Run value
2 1/21/2017 1:30:00 100 1 0
2 1/21/2017 1:45:00 207 1 0
2 1/21/2017 2:00:00 310 1 0
2 1/21/2017 2:15:00 450 1 0
2 1/21/2017 2:30:00 804 0 804
2 1/21/2017 2:45:00 556 0 1360
2 1/21/2017 3:00:00 844 0 2204
2 1/21/2017 3:15:00 396 0 2600
2 1/21/2017 3:30:00 392 0 2992
2 1/21/2017 3:45:00 1220 0 4212
2 1/21/2017 4:00:00 455 1 0
2 1/21/2017 4:15:00 550 1 0
2 1/21/2017 4:30:00 600 1 0
2 1/21/2017 4:45:00 809 1 0
2 1/22/2017 3:30:00 609 1 0
2 1/22/2017 3:45:00 244 0 244
2 1/22/2017 4:00:00 1068 0 1312
2 1/22/2017 4:15:00 44 0 1356
2 1/22/2017 4:30:00 1240 0 2596
2 1/22/2017 4:45:00 40 0 2636
2 1/22/2017 5:00:00 1608 0 4244
2 1/22/2017 5:15:00 0 1 0
2 1/22/2017 5:30:00 0 1 0
could somebody help me with this
We can doing the filter for Appliance , and calculated the groupby
key with cumsum
, notice when the Appliance is 0 the cumsum will not increased and it had been split by the number 1
df['new'] = df.loc[df['Run'].eq(0)].groupby(df['Run'].cumsum())['Appliance'].cumsum()
df['new'].fillna(0,inplace=True)
df
Out[78]:
Home Date Time Appliance Run new
0 2 1/21/2017 1:30:00 100 1 0.0
1 2 1/21/2017 1:45:00 207 1 0.0
2 2 1/21/2017 2:00:00 310 1 0.0
3 2 1/21/2017 2:15:00 450 1 0.0
4 2 1/21/2017 2:30:00 804 0 804.0
5 2 1/21/2017 2:45:00 556 0 1360.0
6 2 1/21/2017 3:00:00 844 0 2204.0
7 2 1/21/2017 3:15:00 396 0 2600.0
8 2 1/21/2017 3:30:00 392 0 2992.0
9 2 1/21/2017 3:45:00 1220 0 4212.0
10 2 1/21/2017 4:00:00 455 1 0.0
11 2 1/21/2017 4:15:00 550 1 0.0
12 2 1/21/2017 4:30:00 600 1 0.0
13 2 1/21/2017 4:45:00 809 1 0.0
14 2 1/22/2017 3:30:00 609 1 0.0
15 2 1/22/2017 3:45:00 244 0 244.0
16 2 1/22/2017 4:00:00 1068 0 1312.0
17 2 1/22/2017 4:15:00 44 0 1356.0
18 2 1/22/2017 4:30:00 1240 0 2596.0
19 2 1/22/2017 4:45:00 40 0 2636.0
20 2 1/22/2017 5:00:00 1608 0 4244.0
21 2 1/22/2017 5:15:00 0 1 0.0
22 2 1/22/2017 5:30:00 0 1 0.0