Search code examples
pythonpandasohlc

Pandas: find nearest higher high (OHLC)


I'm trying to find the fastest solution to iterate over each row of Open High Low Close data and count out how many rows exist between the current row's high and the next high that’s greater than or equal to the original. Here is a shortened code example which I think explains it well (I'd be trying to find the final value of _next):

for i in df.index:
    while df.loc[i, 'high'] > df.loc[_next, 'high']:
         _next += 1

I think I need to use a lambda function inside apply but I can't seem to get it right. Been trying to figure this out for days and would love any advice!

Using this data, the function should return [2, 0, 0, 3, 1, 0, 0]

d = {'high': [1.2, 1.1, 1.1, 1.4, 1.3, 1.2, 1.3]}
df = pd.DataFrame(data=d)


    high   rows_to_higher_high
0   1.2    2
1   1.1    0
2   1.1    0
3   1.4    3
4   1.3    1
5   1.2    0
6   1.3    0

Edit: itertuples(name=None, index=True) is the fastest way to do it so far.


Solution

  • using list comprehension and get index to list

    In [166]: d = {'high': [1.2, 1.1, 1.1, 1.4, 1.3, 1.2, 1.3]}
    
    In [167]: df = pd.DataFrame(data=d)
    
    In [168]: df['rows_to_higher_high']=[(df['high'].values[i+1:]>=val).tolist().index(True) if True in (df['high'].values[i+1:]>=val).tolist() else le
         ...: n((df['high'].values[i+1:]>=val).tolist())  for i, val in enumerate(df['high'].values)]
    
    In [169]: df
    Out[169]:
       high  rows_to_higher_high
    0   1.2                    2
    1   1.1                    0
    2   1.1                    0
    3   1.4                    3
    4   1.3                    1
    5   1.2                    0
    6   1.3                    0