My time series is something like this:
TranID,Time,Price,Volume,SaleOrderVolume,BuyOrderVolume,Type,SaleOrderID,SaleOrderPrice,BuyOrderID,BuyOrderPrice
1,09:25:00,137.69,200,200,453,B,182023,137.69,241939,137.69
2,09:25:00,137.69,253,300,453,S,184857,137.69,241939,137.69
3,09:25:00,137.69,47,300,200,B,184857,137.69,241322,137.69
4,09:25:00,137.69,153,200,200,B,219208,137.69,241322,137.69
I can do the aggregation by summing the Volume column:
res = df.resample('t').agg({'Volume': 'sum'})
But I would like to aggregate the Volume and Type columns together, so that the volume is added when Type is B and subtracted when Type is S. If the total volume after aggregation is negative, then the type is S; otherwise the type is B.
In the above example, after I aggregate the volume, the total volume would become
200 - 253 + 300 + 200 = 447
and the type is B since 447 > 0
result:
Time,Volume,Type
09:25:00,447,B
The simplest way is to multiply the volume by 1 or -1 depending on the value in Type with map, then assign the column Type depending on the sign of the summed volume.
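import numpy as np

res = (
    (df['Volume'] * df['Type'].map({'S': -1, 'B': 1}))  # signed volume: B adds, S subtracts
    .groupby(df['Time']).sum()  # resample should work here too,
    # but your input is not in the right format to use resample (it needs a DatetimeIndex)
    .reset_index(name='Volume')
    # negative total -> 'S', otherwise 'B' (a zero total counts as 'B', as stated in the question)
    .assign(Type=lambda x: np.where(x['Volume'] >= 0, 'B', 'S'))
)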
print(res)
       Time  Volume Type
0  09:25:00     147    B
Note that this gives 147 (200 - 253 + 47 + 153), not 447. Did you use two columns to calculate your result volume of 447?
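If you do want resample rather than groupby, here is a minimal sketch of how it might look. It assumes the Time strings can be turned into timestamps by prepending a date; the date 2000-01-01 and the column name Timestamp are placeholders I made up for illustration, not something from your data. It also uses the '1min' alias, since recent pandas versions warn that 't' is deprecated.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question.
csv_text = """TranID,Time,Price,Volume,SaleOrderVolume,BuyOrderVolume,Type,SaleOrderID,SaleOrderPrice,BuyOrderID,BuyOrderPrice
1,09:25:00,137.69,200,200,453,B,182023,137.69,241939,137.69
2,09:25:00,137.69,253,300,453,S,184857,137.69,241939,137.69
3,09:25:00,137.69,47,300,200,B,184857,137.69,241322,137.69
4,09:25:00,137.69,153,200,200,B,219208,137.69,241322,137.69
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: all trades belong to one day; the date below is a placeholder.
df["Timestamp"] = pd.to_datetime("2000-01-01 " + df["Time"])
df = df.set_index("Timestamp")  # resample needs a DatetimeIndex

# Signed volume: +Volume for B, -Volume for S.
signed = df["Volume"] * df["Type"].map({"S": -1, "B": 1})

res = (
    signed.resample("1min").sum()  # one row per minute
    .reset_index(name="Volume")
    # negative total -> 'S', otherwise 'B'
    .assign(Type=lambda x: np.where(x["Volume"] >= 0, "B", "S"))
)
print(res)
```

One thing to keep in mind: resample also emits rows for minutes with no trades between the first and last timestamp, with a summed volume of 0, so those would be labelled B under the rule above. You may want to filter them out if that is not what you need.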