python dataframe lambda apply rolling-computation

Python dataframe - rolling max and min search on different colum within lines until max

I can't find a method to solve this logical challenge with Python and Dataframes, searching for a lowest value in a nr of rows, where the nr of rows is dependent on a previous calc.

Note: I am pretty much starting level with python and dataframes - so apologies if I don't use the best code or formatting. Feel free to correct or suggest improvements so that I can learn.

df.columns=['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='ms')
df.set_index('Timestamp', inplace=True)
maxframes = 5
df['max5'] = df['High'].shift(-maxframes).rolling(maxframes).max()

This results in:

Timestamp	Open	High	Low	Close	Volume	max5
2023-02-01 00:00:00	23125.13	23140.47	23124.41	23133.74	199.09155	23168.90
2023-02-01 00:01:00	23133.74	23137.00	23123.15	23123.24	153.92866	23168.90
2023-02-01 00:02:00	23123.16	23145.99	23123.01	23142.98	204.22769	23168.90
2023-02-01 00:03:00	23144.00	23168.90	23141.75	23157.56	385.73800	23159.26
2023-02-01 00:04:00	23158.68	23159.26	23131.97	23133.11	209.27799	23136.90
2023-02-01 00:05:00	23133.11	23136.90	23086.73	23087.06	514.67716	23112.84
2023-02-01 00:06:00	23087.06	23112.84	23084.00	23101.51	425.65720
2023-02-01 00:07:00	23100.03	23105.14	23086.08	23086.35	272.72273
2023-02-01 00:08:00	23086.35	23104.53	23084.35	23090.76	218.65485
2023-02-01 00:09:00	23089.85	23091.31	23059.90	23067.90	558.62597

The challenge: max5 results in the highest value in column High of the next 4 rows. for each row I need to find the lowest value in colum Low in next rows (=from this row until that row where max5 is found).

example For row 1 (= 2023-02-01 00:00:00) max5 = 23168.90, found on 2023-02-01 00:03:00 the lowest low from 2023-02-01 00:00:00 to 2023-02-01 00:03:00 is 23123.01

I guess I need to create a lambda or a separate function and use it in an apply.

A lambda seems more appropriate for shorter logic, this is a bit too complex

With a function I enountered the next issues:

I can't find a dataframe function that returns the rowid or rownr of the max5 value found in the High column
I can't find how I can calculate the lowest value in the low column in that column, in the next rows until the row where max5 is found

Solution

I hope I've understood your question right:

def my_func(x):
    if x.name > len(df) - 5:
        return np.nan
    idx_max = df.loc[x.name+1:x.name+5, 'High'].idxmax()
    min_val = df.loc[x.name+1:idx_max, 'Low'].min()
    return df['High'].iat[idx_max], min_val

df = df.reset_index()
df[['max5', 'min_val']] = df.apply(my_func, axis=1, result_type='expand')
print(df)

Prints:

            Timestamp      Open      High       Low     Close     Volume      max5   min_val
0 2023-02-01 00:00:00  23125.13  23140.47  23124.41  23133.74  199.09155  23168.90  23123.01
1 2023-02-01 00:01:00  23133.74  23137.00  23123.15  23123.24  153.92866  23168.90  23123.01
2 2023-02-01 00:02:00  23123.16  23145.99  23123.01  23142.98  204.22769  23168.90  23141.75
3 2023-02-01 00:03:00  23144.00  23168.90  23141.75  23157.56  385.73800  23159.26  23131.97
4 2023-02-01 00:04:00  23158.68  23159.26  23131.97  23133.11  209.27799  23136.90  23086.73
5 2023-02-01 00:05:00  23133.11  23136.90  23086.73  23087.06  514.67716  23112.84  23084.00
6 2023-02-01 00:06:00  23087.06  23112.84  23084.00  23101.51  425.65720       NaN       NaN
7 2023-02-01 00:07:00  23100.03  23105.14  23086.08  23086.35  272.72273       NaN       NaN
8 2023-02-01 00:08:00  23086.35  23104.53  23084.35  23090.76  218.65485       NaN       NaN
9 2023-02-01 00:09:00  23089.85  23091.31  23059.90  23067.90  558.62597       NaN       NaN