I have this DataFrame:
import pandas as pd
import math
from pandas import Timestamp
Date = [Timestamp('2024-03-16 23:59:42'), Timestamp('2024-03-16 23:59:42'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:47'), Timestamp('2024-03-16 23:59:48'), Timestamp('2024-03-16 23:59:48'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49')]
Price = [0.6729, 0.6728, 0.6728, 0.6728, 0.6728, 0.673, 0.6728, 0.6729, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6729, 0.6728]
Side = [-1, -1, -1, 1, -1, 1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1]
Amount = [1579.2963000000002, 7.400799999999999, 6.728, 177.61919999999998, 797.2679999999999, 33650.0, 131.196, 48.448800000000006, 0.6728, 0.6728, 0.6728, 6.728, 0.6728, 1.3456, 0.6728, 0.6728, 0.6728, 0.6728, 0.6729, 0.6728]
buy = [math.nan, math.nan, math.nan, 177.61919999999998, math.nan, 33650.0, math.nan, 48.448800000000006, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, 49.121700000000004, math.nan]
df = pd.DataFrame({
'Date':Date,
'Price':Price,
'Side':Side,
'Amount':Amount,
'buy':buy
})
print(df)
I got the buy column using:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
But I want the buy column to hold 0 instead of NaN when the price has not yet been met in the group, and otherwise the previous value of the cumulative sum.
The buy column should come out as [0, 0, 0, 177.6192, 177.6192, 33650, 177.6192, 48.4488, 177.6192, .....].
How can I implement this?
You could reindex, ffill and fillna:
df['buy'] = (df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
.reindex(df.index).ffill().fillna(0)
)
Or in two steps:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
df['buy'] = df['buy'].ffill().fillna(0)
Output:
Date Price Side Amount buy
0 2024-03-16 23:59:42 0.6729 -1 1579.2963 0.0000
1 2024-03-16 23:59:42 0.6728 -1 7.4008 0.0000
2 2024-03-16 23:59:44 0.6728 -1 6.7280 0.0000
3 2024-03-16 23:59:44 0.6728 1 177.6192 177.6192
4 2024-03-16 23:59:44 0.6728 -1 797.2680 177.6192
5 2024-03-16 23:59:47 0.6730 1 33650.0000 33650.0000
6 2024-03-16 23:59:48 0.6728 -1 131.1960 33650.0000
7 2024-03-16 23:59:48 0.6729 1 48.4488 48.4488
8 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
9 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
10 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
11 2024-03-16 23:59:49 0.6728 -1 6.7280 48.4488
12 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
13 2024-03-16 23:59:49 0.6728 -1 1.3456 48.4488
14 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
15 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
16 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
17 2024-03-16 23:59:49 0.6728 -1 0.6728 48.4488
18 2024-03-16 23:59:49 0.6729 1 0.6729 49.1217
19 2024-03-16 23:59:49 0.6728 -1 0.6728 49.1217
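The two-step version gives the same result because assigning the filtered cumsum to df['buy'] already aligns it on df.index (filling the other rows with NaN), so the explicit reindex is not needed before ffill/fillna. A quick sanity check (after running either variant on the sample df above):
one_step = (df[df['Side'] == 1]
            .groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
            .reindex(df.index).ffill().fillna(0))
assert one_step.equals(df['buy'])  # both variants produce the same column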
Alternatively, if you want to fill the missing rows with the first valid non-NA value, you could use:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
m = df['buy'].notna()
df['buy'] = df['buy'].fillna(df.loc[df['buy'].first_valid_index(), 'buy']).where(m.cummax(), 0)
Output:
Date Price Side Amount buy
0 2024-03-16 23:59:42 0.6729 -1 1579.2963 0.0000
1 2024-03-16 23:59:42 0.6728 -1 7.4008 0.0000
2 2024-03-16 23:59:44 0.6728 -1 6.7280 0.0000
3 2024-03-16 23:59:44 0.6728 1 177.6192 177.6192
4 2024-03-16 23:59:44 0.6728 -1 797.2680 177.6192
5 2024-03-16 23:59:47 0.6730 1 33650.0000 33650.0000
6 2024-03-16 23:59:48 0.6728 -1 131.1960 177.6192
7 2024-03-16 23:59:48 0.6729 1 48.4488 48.4488
8 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
9 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
10 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
11 2024-03-16 23:59:49 0.6728 -1 6.7280 177.6192
12 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
13 2024-03-16 23:59:49 0.6728 -1 1.3456 177.6192
14 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
15 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
16 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
17 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
18 2024-03-16 23:59:49 0.6729 1 0.6729 49.1217
19 2024-03-16 23:59:49 0.6728 -1 0.6728 177.6192
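Note that on this sample the first valid value (177.6192) happens to coincide with the last cumulative sum seen for price 0.6728, which is why this output matches the expected column as well. If the requirement is really "the previous cumulative sum for that price within the hour", a grouped ffill would express that directly; a sketch under that assumption, which should give the same output here:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
# forward-fill within each (hour, Price) group, then zero out rows where
# that price has not produced any buy cumsum yet
df['buy'] = (df.groupby([df['Date'].dt.floor('H'), 'Price'])['buy']
               .ffill()
               .fillna(0))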