The resample function doesn't work with 5h or 10h intervals. It starts okay and then it stops aligning on the day.
d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
df = read.resample('5H').agg(d)
df
output:
Open High Low Close
Date
2024-01-21 00:00:00 317.3 319.5 316.1 317.4
2024-01-21 05:00:00 317.4 319.4 317.2 319.0
2024-01-21 10:00:00 319.0 319.6 318.0 319.1
2024-01-21 15:00:00 319.1 321.6 317.5 320.6
2024-01-21 20:00:00 320.6 321.7 318.2 319.5
2024-01-22 01:00:00 319.5 320.1 314.1 317.2
2024-01-22 06:00:00 317.2 318.6 310.7 312.8
2024-01-22 11:00:00 312.8 315.3 308.4 311.1
2024-01-22 16:00:00 311.2 312.5 303.6 308.5
2024-01-22 21:00:00 308.5 308.7 304.8 308.2
2024-01-23 02:00:00 308.1 311.7 308.1 309.9
2024-01-23 07:00:00 310.0 310.2 291.7 295.9
It starts okay until the next day 2024-01-22.
I want to resample data like this:
It works correctly with 2h, 3h, 4h, or 6h intervals
d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
df = read.resample('4H').agg(d)
df
output:
Open High Low Close
Date
2024-01-21 00:00:00 317.3 319.5 316.1 318.5
2024-01-21 04:00:00 318.5 318.9 317.2 317.5
2024-01-21 08:00:00 317.5 319.6 317.3 319.3
2024-01-21 12:00:00 319.3 319.6 317.5 318.9
2024-01-21 16:00:00 318.9 321.6 317.9 320.6
2024-01-21 20:00:00 320.6 321.7 318.2 318.6
2024-01-22 00:00:00 318.7 320.3 315.3 316.7
2024-01-22 04:00:00 316.7 318.6 314.1 317.8
2024-01-22 08:00:00 317.9 317.9 310.7 313.5
2024-01-22 12:00:00 313.4 315.3 308.4 311.1
2024-01-22 16:00:00 311.2 312.5 303.6 306.6
If you want to resample
with 5h periods except the last one, so that it always starts at midnight, you should normalize
your dates and use them as grouper for groupby.resample
:
Setup for this example
N = 1000
df = pd.DataFrame(np.random.random((N, 4)),
columns=['Open', 'High', 'Low', 'Close'],
index=pd.date_range('2024-01-21', periods=N, freq='10min')
.rename('Date')
)
d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
Code
out = (df.groupby(df.index.normalize(), group_keys=False)
.resample('5h', origin='start_day').agg(d)
)
Example output (first three days):
Open High Low Close
Date
2024-01-21 00:00:00 0.854852 0.991030 0.001056 0.434836
2024-01-21 05:00:00 0.423933 0.993614 0.024808 0.970102
2024-01-21 10:00:00 0.114825 0.992917 0.000970 0.423687
2024-01-21 15:00:00 0.368081 0.934378 0.019684 0.474279
2024-01-21 20:00:00 0.618224 0.991313 0.004855 0.828402
2024-01-22 00:00:00 0.314505 0.981417 0.015882 0.672777
2024-01-22 05:00:00 0.637003 0.928499 0.018432 0.081630
2024-01-22 10:00:00 0.798733 0.939966 0.002545 0.779787
2024-01-22 15:00:00 0.786353 0.980841 0.007235 0.622986
2024-01-22 20:00:00 0.455437 0.985423 0.207164 0.492761
2024-01-23 00:00:00 0.759215 0.943299 0.012884 0.610557
2024-01-23 05:00:00 0.491669 0.907216 0.022135 0.204769
2024-01-23 10:00:00 0.035259 0.950254 0.020136 0.533030
2024-01-23 15:00:00 0.977045 0.958560 0.000002 0.791182
2024-01-23 20:00:00 0.327000 0.953044 0.021118 0.559102