Search code examples
pandastime-seriesohlcresample

Pandas Resample OHLC data Skipping time


I have a time series OHLC 1 min date.

date,open,high,low,close,volume
2023-04-20 09:15:00+05:30,262.0,267.85,262.0,262.5,7300
2023-04-20 09:16:00+05:30,262.5,264.3,261.8,262.6,6300
2023-04-20 09:17:00+05:30,262.6,266.95,262.0,265.75,3600
2023-04-20 09:18:00+05:30,265.75,274.25,265.75,274.25,2200
2023-04-20 09:19:00+05:30,273.95,275.8,271.95,274.9,5750
2023-04-20 09:20:00+05:30,275.0,284.4,275.0,283.55,6350
2023-04-20 09:21:00+05:30,285.0,286.95,282.6,283.55,3650
2023-04-20 09:22:00+05:30,283.0,283.45,280.1,281.2,3150
2023-04-20 09:23:00+05:30,281.2,286.55,281.2,286.55,1800
2023-04-20 09:24:00+05:30,286.6,288.5,286.5,287.2,1600
2023-04-20 09:25:00+05:30,287.2,288.75,286.4,286.7,1450
2023-04-20 09:26:00+05:30,287.0,288.0,287.0,287.45,500
2023-04-20 09:27:00+05:30,287.45,287.45,285.0,285.0,900
2023-04-20 09:28:00+05:30,284.9,286.85,283.55,286.4,2600
2023-04-20 09:29:00+05:30,286.4,286.4,285.0,286.4,1600
2023-04-20 09:30:00+05:30,287.25,291.7,286.9,290.45,4500
2023-04-20 09:31:00+05:30,290.55,291.9,289.9,290.9,1750
2023-04-20 09:32:00+05:30,289.25,289.3,286.7,286.7,1550
2023-04-20 09:33:00+05:30,286.7,290.15,286.7,288.0,250
2023-04-20 09:34:00+05:30,288.0,291.0,288.0,289.85,850
2023-04-20 09:35:00+05:30,289.85,289.85,289.4,289.85,1500
2023-04-20 09:36:00+05:30,289.85,290.0,288.1,288.75,4050
2023-04-20 09:37:00+05:30,288.75,288.75,287.45,288.05,1150
2023-04-20 09:38:00+05:30,288.05,288.45,285.85,287.45,1650
2023-04-20 09:39:00+05:30,287.45,293.25,287.45,291.15,2250
2023-04-20 09:40:00+05:30,291.1,291.15,290.7,291.15,550
2023-04-20 09:41:00+05:30,291.15,292.0,291.0,291.95,550
2023-04-20 09:42:00+05:30,291.95,291.95,288.35,288.55,200
2023-04-20 09:43:00+05:30,288.55,290.0,288.0,288.55,1350
2023-04-20 09:44:00+05:30,288.55,290.0,288.55,289.0,100
2023-04-20 09:45:00+05:30,289.0,289.0,286.05,286.05,400
2023-04-20 09:46:00+05:30,286.05,286.05,282.25,283.3,2650
2023-04-20 09:47:00+05:30,283.3,283.5,283.3,283.3,450
2023-04-20 09:48:00+05:30,283.3,283.3,279.85,279.85,750

I need to resample the same to 4 min data starting from 9:15.

I wrote the following code.

import pandas as pd
df = pd.read_csv('re_s.csv',parse_dates=True,index_col=0)
ohlc_dict = {'open':'first','high':'max','low':'min','close': 'last','volume':'sum'}
new_df=df.resample('4T').apply(ohlc_dict).ffill().between_time('09:15','15:30')

>>> print(new_df)
                             open    high     low   close  volume
date
2023-04-20 09:16:00+05:30  262.50  275.80  261.80  274.90   17850
2023-04-20 09:20:00+05:30  275.00  286.95  275.00  286.55   14950
2023-04-20 09:24:00+05:30  286.60  288.75  285.00  285.00    4450
2023-04-20 09:28:00+05:30  284.90  291.90  283.55  290.90   10450
2023-04-20 09:32:00+05:30  289.25  291.00  286.70  289.85    4150
...                           ...     ...     ...     ...     ...
2023-04-24 13:44:00+05:30  282.35  285.95  282.00  283.45    8250
2023-04-24 13:48:00+05:30  283.45  284.50  274.05  277.35    9900
2023-04-24 13:52:00+05:30  277.10  280.70  276.00  276.15   12150
2023-04-24 13:56:00+05:30  276.15  279.00  274.90  275.10    5600
2023-04-24 14:00:00+05:30  275.10  282.60  273.00  280.50    4350

[448 rows x 5 columns]

If we see , the same starts from 09:16 , so its skipping by 1 min.

I am trying to get the data starting from 9:15.

Example: The values should be as follows (Taken from the portal) 09:15 09:19 09:23 09:27 09:31

Thanks Saurabh

Tried to use resample method and between_time method.


Solution

  • Code

    ohlc_dict = {'open':'first','high':'max','low':'min','close': 'last','volume':'sum'}
    out = df.resample('4T', origin='start').agg(ohlc_dict)
    

    out

                                open    high    low     close   volume
    date                    
    2023-04-20 09:15:00+05:30   262.00  274.25  261.80  274.25  19400
    2023-04-20 09:19:00+05:30   273.95  286.95  271.95  281.20  18900
    2023-04-20 09:23:00+05:30   281.20  288.75  281.20  287.45  5350
    2023-04-20 09:27:00+05:30   287.45  291.70  283.55  290.45  9600
    2023-04-20 09:31:00+05:30   290.55  291.90  286.70  289.85  4400
    2023-04-20 09:35:00+05:30   289.85  290.00  285.85  287.45  8350
    2023-04-20 09:39:00+05:30   287.45  293.25  287.45  288.55  3550
    2023-04-20 09:43:00+05:30   288.55  290.00  282.25  283.30  4500
    2023-04-20 09:47:00+05:30   283.30  283.50  279.85  279.85  1200