I have a dataframe with a column that consists of datetime values, one that consists of speed values, and one that consists of timedelta values between the rows.
I would want to get the cumulative sum of timedeltas whenever the speed is below 2 knots. When the speed rises above 2 knots, I would like this cumulative sum to reset to 0, and then to start summing at the next instance of speed observations below 2 knots.
I have started by flagging all observations of speed values < 2. I only manage to get the cumulative sum for all of the observations with speed < 2, but not a cumulative sum separated for each instance.
The dataframe looks like this, and cum_sum
is the desired output:
datetime speed timedelta cum_sum flag
1-1-2019 19:30:00 0.5 0 0 1
1-1-2019 19:32:00 0.7 2 2 1
1-1-2019 19:34:00 0.1 2 4 1
1-1-2019 19:36:00 5.0 2 0 0
1-1-2019 19:38:00 25.0 2 0 0
1-1-2019 19:42:00 0.1 4 4 1
1-1-2019 19:49:00 0.1 7 11 1
Note: Uses global variable
c = 0
def fun(x):
global c
if x['speed'] > 2.0:
c = 0
else:
c = x['timedelta']+c
return c
df = pd.DataFrame( {'datetime': ['1-1-2019 19:30:00']*7,
'speed': [0.5,.7,0.1,5.0,25.0,0.1,0.1], 'timedelta': [0,2,2,2,2,4,7]})
df['cum_sum']=df.apply(fun, axis=1)
datetime speed timedelta cum_sum
0 1-1-2019 19:30:00 0.5 0 0
1 1-1-2019 19:30:00 0.7 2 2
2 1-1-2019 19:30:00 0.1 2 4
3 1-1-2019 19:30:00 5.0 2 0
4 1-1-2019 19:30:00 25.0 2 0
5 1-1-2019 19:30:00 0.1 4 4
6 1-1-2019 19:30:00 0.1 7 11