open high low close Volume
ctime
2022-11-07 01:00:00.012000+01:00 20900.0 20900.0 20900.0 20900.0 209.00
2022-11-07 01:00:00.019000+01:00 20900.1 20900.1 20900.1 20900.1 1254.00
2022-11-07 01:00:00.111000+01:00 20900.0 20900.0 20900.0 20900.0 11390.50
2022-11-07 01:00:00.188000+01:00 20900.1 20900.1 20900.1 20900.1 20.90
2022-11-07 01:00:00.194000+01:00 20900.0 20900.0 20900.0 20900.0 4995.10
2022-11-07 01:00:00.203000+01:00 20900.0 20900.0 20900.0 20900.0 209.00
2022-11-07 01:00:00.217000+01:00 20900.0 20900.0 20900.0 20900.0 239827.50
2022-11-07 01:00:00.237000+01:00 20900.0 20900.0 20900.0 20900.0 20.90
2022-11-07 01:00:03.028000+01:00 20900.0 20900.0 20900.0 20900.0 20.90
2022-11-07 01:00:03.743000+01:00 20900.0 20900.0 20900.0 20900.0 41.80
I need to gather in periods of 5 trades, that is to say that the open is the first row, that the close is the 5th trade and the minimum and the maximum is the corresponding minimum and maximum of the 5 trades, the same for the volume. there should be only 2 rows left in the exposed example once the transformation is done.
I have tried searching pandas for an approximation but I have not found a satisfactory solution
Define a new series that put the trades into groups of 5, then aggregate per group:
trade_group = np.arange(len(df)) // 5
# The `open` of the group is the `open` of the first trade in the group
# `high` is max of the `high` in the group
# and so on...
df.groupby(trade_group).agg({
"open": "first",
"high": "max",
"low": "min",
"close": "last",
"Volume": "sum"
})