Search code examples
pythonpandasmoving-averagerolling-computation

Pandas rolling mean with offset by (not continuously available) date


given the following example table

Index Date Weekday Value
1 05/12/2022 2 10
2 06/12/2022 3 20
3 07/12/2022 4 40
4 09/12/2022 6 10
5 10/12/2022 7 60
6 11/12/2022 1 30
7 12/12/2022 2 40
8 13/12/2022 3 50
9 14/12/2022 4 60
10 16/12/2022 6 20
11 17/12/2022 7 50
12 18/12/2022 1 10
13 20/12/2022 3 20
14 21/12/2022 4 10
15 22/12/2022 5 40

I want to calculate a rolling average of the last three observations (at least) a week ago. I cannot use .shift as some dates are randomly missing, and .shift would therefore not produce a reliable output.

Desired output example for last three rows in the example dataset:

Index 13: Avg of indices 8, 7, 6 = (30+40+50) / 3 = 40

Index 14: Avg of indices 9, 8, 7 = (40+50+60) / 3 = 50

Index 15: Avg of indices 9, 8, 7 = (40+50+60) / 3 = 50

What would be a working solution for this? Thanks!

Thanks!


Solution

  • MOSTLY inspired from @Aidis you could, make his solution an apply:

    df['mean']=df.apply(lambda y:  df["Value"][df['Date'] <= y['Date'] - pd.Timedelta(1, "W")].tail(3).mean(), axis=1)
    

    or spliting the data at each call which may run faster if you have lots of data (to be tested):

    df['mean']=df.apply(lambda y:  df.loc[:y.name, "Value"][ df.loc[:y.name,'Date'] <= y['Date'] - pd.Timedelta(1, "W")].tail(3).mean(), axis=1)
    

    which returns:

        Index       Date  Weekday  Value       mean
    0       1 2022-12-05        2     10        NaN
    1       2 2022-12-06        3     20        NaN
    2       3 2022-12-07        4     40        NaN
    3       4 2022-12-09        6     10        NaN
    4       5 2022-12-10        7     60        NaN
    5       6 2022-12-11        1     30        NaN
    6       7 2022-12-12        2     40  10.000000
    7       8 2022-12-13        3     50  15.000000
    8       9 2022-12-14        4     60  23.333333
    9      10 2022-12-16        6     20  23.333333
    10     11 2022-12-17        7     50  36.666667
    11     12 2022-12-18        1     10  33.333333
    12     13 2022-12-20        3     20  40.000000
    13     14 2022-12-21        4     10  50.000000
    14     15 2022-12-22        5     40  50.000000