Extract lowest values from pandas dataframe based on the highest value


I get data from a finance API that looks like this:

ind | open  high  low   close
0   | 2.04  2.04  2.04  2.04
1   | 2.06  2.10  2.06  2.10
2   | 2.10  2.11  2.10  2.11
3   | 2.12  2.12  2.10  2.10
4   | 2.06  2.06  2.06  2.06

My Question: Each row represents 1 minute. Once I've identified the highest value, how can I extract the lowest value BEFORE the high, and the lowest value AFTER the high? I want to create two new columns and put both values in row 0.

My usual approach doesn't carry over to pandas: I come from an Excel/VBA background. In VBA, I would tell Excel to store the highest value in a variable, and then go down each row and compare that row's "high" value to the variable. If the row's "high" value isn't equal to the variable, Excel grabs the "low" value in that row and dumps it into an array. I'd tell Excel to keep dumping "low" values into the array until the "high" value in a row matches the variable. I then extract the lowest value from the array, and that's how I get the "low BEFORE high" value.

However, I'm having a tough time "moving" within a pandas dataframe. Can you lend me a hand?


Solution

  • Assuming your data frame is sequentially indexed (0, 1, 2, 3, ...):

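    # Row label of the highest close
    idxmax = df['close'].idxmax()

    max_value = df.loc[idxmax, 'close']
    # .loc slices by label and includes both endpoints, hence the -1 / +1
    lowest_before = df.loc[:idxmax-1, 'close'].min()
    lowest_after = df.loc[idxmax+1:, 'close'].min()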
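
The snippet above works on the close column, while the question talks about the "high" and "low" columns and about storing the results in row 0. Here is a minimal sketch of that variant, assuming the peak is the largest "high" and the lows come from the "low" column. It slices by position, so it should not depend on the index being 0, 1, 2, .... The column names low_before_high and low_after_high are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "open":  [2.04, 2.06, 2.10, 2.12, 2.06],
    "high":  [2.04, 2.10, 2.11, 2.12, 2.06],
    "low":   [2.04, 2.06, 2.10, 2.10, 2.06],
    "close": [2.04, 2.10, 2.11, 2.10, 2.06],
})

# Position (not label) of the first row holding the highest "high"
pos = int(df["high"].to_numpy().argmax())

# Positional slices leave out the row of the high itself;
# an empty slice makes .min() return NaN
low_before = df["low"].iloc[:pos].min()
low_after = df["low"].iloc[pos + 1:].min()

# Hypothetical column names; only the first row is filled, the rest stay NaN
df["low_before_high"] = np.nan
df["low_after_high"] = np.nan
df.iloc[0, df.columns.get_loc("low_before_high")] = low_before
df.iloc[0, df.columns.get_loc("low_after_high")] = low_after
```

If the high falls on the first or last row, one of the slices is empty and the matching value comes out as NaN, so you may want to check for that before using it.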