Tags: python, pandas, dataframe, rolling-computation

Find the row offset for the maximum value over the next N rows in Pandas?


I have some data in a Pandas DataFrame:

  Price
1 $10
2 $11
3 $15
4 $18
5 $13
6 $4
7 $25

and I am trying to get the row offset of the maximum over the next N rows. For example, when N = 2, the output would look like:

  Price  offset
1 $10    2   <- offset is defined as the row offset of the maximum among the next two (N) values ($11 and $15)
2 $11    2   <- similarly, the offset here is the row offset of the maximum among the next two (N) values ($15 and $18)
3 $15    1
4 $18    1
5 $13    2
6 $4     1    
7 $25    0

I can get the value of the maximum over the next N rows using:

# Here, N=12 (note: this window spans the current row plus the following 11 rows)
df['max_price'] = df['Price'].rolling(12).max().shift(-11)

However, is it possible to get the row offset position for the maximum value of the next N rows using similar logic? Any pointers would be very helpful.


Solution

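  • You can use rolling, but rolling windows look backward, so you need to cheat a bit by rolling on the reversed series. Each window then covers the rows that follow in the original order.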

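    # Strip the '$' sign so the prices compare as numbers
    s = df['Price'].str.strip('$').astype(int)

    N = 2
    df['offset'] = (s.iloc[::-1]                   # reverse so windows look "forward"
                    .rolling(N, min_periods=1)
                    .apply(lambda w: w.idxmax())   # index label of each window's max
                    .shift()                       # exclude the current row
                    .iloc[::-1]                    # restore the original order
                    - df.index                     # label -> offset (assumes a consecutive integer index)
                    ).fillna(0).astype(int)        # last row has no following rows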
    

    Output:

      Price  offset
    1   $10       2
    2   $11       2
    3   $15       1
    4   $18       1
    5   $13       2
    6    $4       1
    7   $25       0
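
The approach above subtracts index labels, so it assumes a consecutive integer index. If that does not hold, a position-based variant may be easier to reason about. The following is only a sketch under some assumptions: `next_max_offset` is a hypothetical helper name (not a pandas or NumPy function), it needs NumPy 1.20 or later for `sliding_window_view`, and it assumes `n >= 1` and no NaN values in the data.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def next_max_offset(values, n):
    """Offset (1..n) of the maximum among the next n values; 0 if no rows follow.

    Hypothetical helper for illustration; works on positions, not index labels.
    """
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        return np.zeros(0, dtype=int)
    # Drop the first value and pad the tail with -inf so every row gets a full window.
    padded = np.concatenate([arr[1:], np.full(n, -np.inf)])
    windows = sliding_window_view(padded, n)  # row i holds arr[i+1 : i+1+n]
    offsets = windows.argmax(axis=1) + 1      # argmax is 0-based, offsets start at 1
    offsets[-1] = 0                           # the last row has nothing after it
    return offsets


# Same data as in the question, with N = 2
df = pd.DataFrame(
    {"Price": ["$10", "$11", "$15", "$18", "$13", "$4", "$25"]},
    index=range(1, 8),
)
prices = df["Price"].str.strip("$").astype(int)
df["offset"] = next_max_offset(prices.to_numpy(), n=2)
print(df)  # offset column: 2, 2, 1, 1, 2, 1, 0
```

One difference worth knowing about: on ties, `argmax` picks the nearest of the equal maxima, whereas the reversed-rolling version appears to pick the furthest one, since `idxmax` returns the first hit in the reversed window.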