Search code examples
pythonpandasdataframepercentile

Dataframe percentile using a rolling value from another column


I have below dataframe structure as a sample.

I want to obtain a column where it calculates the percentile of the "price column" based on the value of the "percentile" column, using a rolling n period lookback.

Is it possible? I tried using some kind of a lambda function and use the .apply syntax but couldn't get it to work.

        date     percentile  price   desired_row
    2019-11-08  0.355556    0.6863    36th percentile of price of last n period
    2019-11-11  0.316667    0.6851    32nd percentile of price of last n period
    2019-11-12  0.305556    0.6841    ...
    2019-11-13  0.302778    0.6838    ...
    2019-11-14  0.244444    0.6798    ...

Thanks!!


Solution

  • Based on this answer, you can use rolling on the column price with the column percentile in index and then use quantile in apply with the parameter raw=False:

    window = 3
    df['desired_row'] = df.set_index('percentile')['price'].rolling(window)\
                          .apply(lambda x: x.quantile(q=x.index[-1]), raw=False).values
    print (df)
             date  percentile   price  desired_row
    0  2019-11-08    0.355556  0.6863          NaN
    1  2019-11-11    0.316667  0.6851          NaN
    2  2019-11-12    0.305556  0.6841     0.684711
    3  2019-11-13    0.302778  0.6838     0.683982
    4  2019-11-14    0.244444  0.6798     0.681756
    

    you can change the interpolation parameter in quantile depending on your need.