Search code examples
pythonpandasdatetimetime-seriesohlc

Is there any good way to perform calculation and assign values on time series data chunks?


I want to perform some calculation on time series data, and copy the output on specific rows, I have tried several things like resampling data to get the values. Below are the codes and outputs that I was trying to get. It would be very helpful to get some experts tips to do the things efficiently.

Question: 1. I want to get 'High' 'Low' and 'Close of the Day from the dataset which has 15 min time frame. 2. then calculate '(High+Low+Close)/3' and copy this value to next day's chunk of 15min time frame. and so on.

Below is the actual data.

............................................................
Datetime                 |   Open  | High |   Low   |Close
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
2021-01-04 09:15:00+05:30   2575.05 2591.95 2570.0  2571.5 ---------------Day 1
2021-01-04 09:30:00+05:30   2571.35 2575.05 2561.05 2569.35
2021-01-04 09:45:00+05:30   2568.6  2579.8  2568.6  2576.25
2021-01-04 10:00:00+05:30   2576.25 2578.0  2567.0  2567.8
2021-01-04 10:15:00+05:30   2568.25 2570.0  2564.25 2565.0
2021-01-04 10:30:00+05:30   2565.0  2565.15 2556.5  2557.45
2021-01-04 10:45:00+05:30   2558.35 2559.0  2543.85 2553.85
2021-01-04 11:00:00+05:30   2553.85 2555.55 2533.15 2553.55
2021-01-04 11:15:00+05:30   2553.5  2555.9  2547.25 2551.85
2021-01-04 11:30:00+05:30   2551.85 2576.7  2550.15 2571.8
2021-01-04 11:45:00+05:30   2571.45 2571.8  2565.1  2565.7
2021-01-04 12:00:00+05:30   2566.25 2569.9  2561.55 2565.45
2021-01-04 12:15:00+05:30   2564.65 2566.1  2560.25 2562.5
2021-01-04 12:30:00+05:30   2562.5  2568.15 2562.15 2566.0
2021-01-04 12:45:00+05:30   2566.45 2569.9  2565.1  2567.8
2021-01-04 13:00:00+05:30   2567.9  2568.0  2561.9  2565.0
2021-01-04 13:15:00+05:30   2565.0  2565.0  2556.4  2558.0
2021-01-04 13:30:00+05:30   2559.1  2567.9  2557.2  2565.65
2021-01-04 13:45:00+05:30   2565.8  2571.1  2565.0  2571.0
2021-01-04 14:00:00+05:30   2570.95 2575.0  2565.1  2572.75
2021-01-04 14:15:00+05:30   2572.4  2576.0  2567.45 2575.35
2021-01-04 14:30:00+05:30   2575.6  2580.8  2570.95 2578.65
2021-01-04 14:45:00+05:30   2579.25 2584.0  2575.55 2581.8
2021-01-04 15:00:00+05:30   2581.7  2582.7  2575.0  2579.7
2021-01-04 15:15:00+05:30   2579.75 2582.65 2575.0  2578.5
2021-01-05 09:15:00+05:30   2581.15 2623.45 2573.3  2611.0-----------------Day2
2021-01-05 09:30:00+05:30   2611.45 2628.0  2609.0  2628.0
2021-01-05 09:45:00+05:30   2627.0  2635.0  2622.5  2623.0
2021-01-05 10:00:00+05:30   2622.65 2630.0  2621.95 2626.7
2021-01-05 10:15:00+05:30   2626.65 2628.0  2617.0  2617.4
2021-01-05 10:30:00+05:30   2617.4  2624.55 2615.0  2616.95
2021-01-05 10:45:00+05:30   2616.95 2627.0  2615.1  2627.0
2021-01-05 11:00:00+05:30   2627.6  2642.0  2626.55 2635.85
2021-01-05 11:15:00+05:30   2635.85 2644.0  2633.2  2638.85
2021-01-05 11:30:00+05:30   2638.85 2640.05 2633.0  2639.85
2021-01-05 11:45:00+05:30   2639.9  2640.0  2633.0  2636.65
2021-01-05 12:00:00+05:30   2636.05 2639.95 2633.0  2638.95
2021-01-05 12:15:00+05:30   2639.15 2642.0  2635.2  2637.0
2021-01-05 12:30:00+05:30   2637.2  2649.0  2636.8  2644.5
2021-01-05 12:45:00+05:30   2644.8  2650.0  2644.45 2647.9
2021-01-05 13:00:00+05:30   2647.35 2650.0  2644.0  2644.65
2021-01-05 13:15:00+05:30   2644.65 2645.0  2628.55 2634.45
2021-01-05 13:30:00+05:30   2634.5  2643.5  2631.0  2641.0
2021-01-05 13:45:00+05:30   2641.05 2641.4  2633.0  2634.7
2021-01-05 14:00:00+05:30   2634.8  2638.9  2632.0  2632.0
2021-01-05 14:15:00+05:30   2632.0  2638.0  2630.0  2635.95
2021-01-05 14:30:00+05:30   2635.95 2642.0  2633.2  2639.95
2021-01-05 14:45:00+05:30   2639.85 2642.0  2637.25 2639.4
2021-01-05 15:00:00+05:30   2639.25 2658.0  2638.35 2656.9
2021-01-05 15:15:00+05:30   2656.15 2659.0  2646.95 2654.95
2021-01-06 09:15:00+05:30   2660.1  2662.1  2638.5  2639.0 ----------------Day3
2021-01-06 09:30:00+05:30   2639.0  2648.8  2635.2  2646.05

I tried Resampling the data and get the values of high, low and close then calculated h+L+C/3 seperately then copy it bu typing. Below is code to resample

df1 = df.resample('1D', # on='col name'   if the date column is not an index
                 origin='start').agg({'Open':'first',
                                      'Close':'last',
                                      'Low':'min',
                                      'High':'max'})

Is there any good way to get the desired df easily. Below is the desired output

for date 2021-01-04 Day's High = 2591.95, Low = 2533.15, Close = 2578.5
(High+Low+Close)/3 = 2567.68
the value 2567.68 get copied to next day 2021-01-05 showing below 

then wants to calculate H+L+C/3 on 2021-04-05 and then copy it to next day's 15 min time frame data shown below. and so on 
...........................................................................
Datetime                 |   Open  | High |   Low   |Close | ((H+L+C)/3) |
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
2021-01-04 09:15:00+05:30   2575.05 2591.95 2570.0  2571.5               ------Day1 
2021-01-04 09:30:00+05:30   2571.35 2575.05 2561.05 2569.35
2021-01-04 09:45:00+05:30   2568.6  2579.8  2568.6  2576.25
2021-01-04 10:00:00+05:30   2576.25 2578.0  2567.0  2567.8
2021-01-04 10:15:00+05:30   2568.25 2570.0  2564.25 2565.0
2021-01-04 10:30:00+05:30   2565.0  2565.15 2556.5  2557.45
2021-01-04 10:45:00+05:30   2558.35 2559.0  2543.85 2553.85
2021-01-04 11:00:00+05:30   2553.85 2555.55 2533.15 2553.55
2021-01-04 11:15:00+05:30   2553.5  2555.9  2547.25 2551.85
2021-01-04 11:30:00+05:30   2551.85 2576.7  2550.15 2571.8
2021-01-04 11:45:00+05:30   2571.45 2571.8  2565.1  2565.7
2021-01-04 12:00:00+05:30   2566.25 2569.9  2561.55 2565.45
2021-01-04 12:15:00+05:30   2564.65 2566.1  2560.25 2562.5
2021-01-04 12:30:00+05:30   2562.5  2568.15 2562.15 2566.0
2021-01-04 12:45:00+05:30   2566.45 2569.9  2565.1  2567.8
2021-01-04 13:00:00+05:30   2567.9  2568.0  2561.9  2565.0
2021-01-04 13:15:00+05:30   2565.0  2565.0  2556.4  2558.0
2021-01-04 13:30:00+05:30   2559.1  2567.9  2557.2  2565.65
2021-01-04 13:45:00+05:30   2565.8  2571.1  2565.0  2571.0
2021-01-04 14:00:00+05:30   2570.95 2575.0  2565.1  2572.75
2021-01-04 14:15:00+05:30   2572.4  2576.0  2567.45 2575.35
2021-01-04 14:30:00+05:30   2575.6  2580.8  2570.95 2578.65
2021-01-04 14:45:00+05:30   2579.25 2584.0  2575.55 2581.8
2021-01-04 15:00:00+05:30   2581.7  2582.7  2575.0  2579.7
2021-01-04 15:15:00+05:30   2579.75 2582.65 2575.0  2578.5
2021-01-05 09:15:00+05:30   2581.15 2623.45 2573.3  2611.0-   2567.68 -------day2
2021-01-05 09:30:00+05:30   2611.45 2628.0  2609.0  2628.0    2567.68
2021-01-05 09:45:00+05:30   2627.0  2635.0  2622.5  2623.0    2567.68
2021-01-05 10:00:00+05:30   2622.65 2630.0  2621.95 2626.7    2567.68
2021-01-05 10:15:00+05:30   2626.65 2628.0  2617.0  2617.4    2567.68
2021-01-05 10:30:00+05:30   2617.4  2624.55 2615.0  2616.95   2567.68
2021-01-05 10:45:00+05:30   2616.95 2627.0  2615.1  2627.0    2567.68
2021-01-05 11:00:00+05:30   2627.6  2642.0  2626.55 2635.85   2567.68
2021-01-05 11:15:00+05:30   2635.85 2644.0  2633.2  2638.85   2567.68
2021-01-05 11:30:00+05:30   2638.85 2640.05 2633.0  2639.85   2567.68
2021-01-05 11:45:00+05:30   2639.9  2640.0  2633.0  2636.65   2567.68
2021-01-05 12:00:00+05:30   2636.05 2639.95 2633.0  2638.95   2567.68
2021-01-05 12:15:00+05:30   2639.15 2642.0  2635.2  2637.0    2567.68
2021-01-05 12:30:00+05:30   2637.2  2649.0  2636.8  2644.5    2567.68
2021-01-05 12:45:00+05:30   2644.8  2650.0  2644.45 2647.9    2567.68
2021-01-05 13:00:00+05:30   2647.35 2650.0  2644.0  2644.65   2567.68
2021-01-05 13:15:00+05:30   2644.65 2645.0  2628.55 2634.45   2567.68
2021-01-05 13:30:00+05:30   2634.5  2643.5  2631.0  2641.0    2567.68
2021-01-05 13:45:00+05:30   2641.05 2641.4  2633.0  2634.7    2567.68
2021-01-05 14:00:00+05:30   2634.8  2638.9  2632.0  2632.0    2567.68
2021-01-05 14:15:00+05:30   2632.0  2638.0  2630.0  2635.95   2567.68
2021-01-05 14:30:00+05:30   2635.95 2642.0  2633.2  2639.95   2567.68
2021-01-05 14:45:00+05:30   2639.85 2642.0  2637.25 2639.4    2567.68
2021-01-05 15:00:00+05:30   2639.25 2658.0  2638.35 2656.9    2567.68
2021-01-05 15:15:00+05:30   2656.15 2659.0  2646.95 2654.95   2567.68
2021-01-06 09:15:00+05:30   2660.1  2662.1  2638.5  2639.0            -----Day3
2021-01-06 09:30:00+05:30   2639.0  2648.8  2635.2  2646.05

I am new to this. Your Help Is Really Appreciated :)


Solution

  • If possible match first values of DatetimeIndex in df1 with original first value of day in df.index is possible aggregate, create mean (same like sum divide 3), shifting for next day and assign back to new column with forward filling missing values:

    df1 = df.resample('1D', # on='col name'   if the date column is not an index
                     origin='start').agg({'Open':'first',
                                          'Close':'last',
                                          'Low':'min',
                                          'High':'max'})
                                          
    s = df1[['High','Low','Close']].mean(axis=1).shift()
    
    df['new'] = s
    df['new'] = df['new'].ffill()
    

    Another idea is use Series.dt.normalize for grouping per DatetimeIndex without times and for new column is used Index.map:

    df1 = df.groupby(df.index.normalize()).agg({'Open':'first',
                                          'Close':'last',
                                          'Low':'min',
                                          'High':'max'})
                                          
    s = df1[['High','Low','Close']].mean(axis=1).shift()
    
    df['new'] = df.index.normalize().map(s)