I have this dataframe -
counter duration amount
0 1 0.08 1,235
1 2 0.36 1,170
2 3 1.04 1,222
3 4 0.81 1,207
4 5 3.99 1,109
5 6 1.20 1,261
6 7 4.24 1,068
7 8 3.07 1,098
8 9 2.08 1,215
9 10 4.09 1,043
10 11 2.95 1,176
11 12 3.96 1,038
12 13 3.95 1,119
13 14 3.92 1,074
14 15 3.91 1,076
15 16 1.50 1,224
16 17 3.65 962
17 18 3.85 1,039
18 19 3.82 1,062
19 20 3.34 917
I would like to create another column based on the following logic:
For each row, I want to calculate a running sum of 'duration' but it should be a running sum for the rows that are below the current row (lead and not lag). I would like to stop the calculation when the running sum reaches 5 -> when it reaches 5, I want to return the running sum for 'amount' (with the same logic).
For instance, for 'counter' 1 it should take the first 4 rows (0.08+0.36+1.04+0.81<5) and then to return 1,235+1,170+1,222+1,207=4834
for 'counter' 2 it should take only 0.36 + 1.04 + 0.81<5 and to return 1,170+1,222+1,207=3599
Will appreciate any help!
I would first go through the 2 columns once for their cumulative sums.
cum_amount = df['amount'].cumsum()
cum_duration = df['duration'].cumsum()
Get a list ready for the results
results = []
Then loop through each index (equivalent to counter)
for idx in cum_duration.index:
# keep only rows within `5` and the max. index is where the required numbers are located
wanted_idx = (cum_duration[cum_duration<5]).index.max()
# read those numbers with the wanted index
results.append({'idx': idx, 'cum_duration': cum_duration[wanted_idx], 'cum_amount': cum_amount[wanted_idx]})
# subtract the lag (we need only the leads not the lags)
cum_amount -= cum_amount[idx]
cum_duration -= cum_duration[idx]
Finally the result in a DataFrame.
pd.DataFrame(results)
idx cum_duration cum_amount
0 0 2.29 4834.0
1 1 2.21 3599.0
2 2 1.85 2429.0
3 3 4.80 2316.0
4 4 3.99 1109.0
5 5 1.20 1261.0
6 6 4.24 1068.0
7 7 3.07 1098.0
8 8 2.08 1215.0
9 9 4.09 1043.0
10 10 2.95 1176.0
11 11 3.96 1038.0
12 12 3.95 1119.0
13 13 3.92 1074.0
14 14 3.91 1076.0
15 15 1.50 1224.0
16 16 3.65 962.0
17 17 3.85 1039.0
18 18 3.82 1062.0
19 19 3.34 917.0