Search code examples
pandasmeancalculated-columnscalculation

Calculate a Rolling Mean Average of Last Two Weeks of Daily Data by Month and Year


I am trying to calculate a rolling 14-day month-end average of the last 14 days of each month with panda dfs and I'm not sure how to do it so that the final answer is the rolling mean of the last 14 days of each month for all the months of the year. My data looks like this (Month, Day, Year, Var1 columns):

Month   Day Year    Var1    Daily Rolling Mean
1   1   2022    8   n/a
1   2   2022    8   n/a
1   3   2022    17  n/a
1   4   2022    1   n/a
1   5   2022    10  n/a
1   6   2022    21  n/a
1   7   2022    9   n/a
1   8   2022    18  n/a
1   9   2022    18  n/a
1   10  2022    16  n/a
1   11  2022    2   n/a
1   12  2022    3   n/a
1   13  2022    24  n/a
1   14  2022    13  n/a
1   15  2022    4   n/a
1   16  2022    9   n/a
1   17  2022    2   2
1   18  2022    0   1
1   19  2022    22  8
1   20  2022    14  9.5
1   21  2022    14  10.4
1   22  2022    18  11.66666667
1   23  2022    22  13.14285714
1   24  2022    18  13.75
1   25  2022    5   12.77777778
1   26  2022    23  13.8
1   27  2022    13  13.72727273
1   28  2022    11  13.5
1   29  2022    7   13
1   30  2022    24  13.78571429
1   31  2022    15  13.86666667
2   1   2022    20  n/a
2   2   2022    19  n/a
2   3   2022    5   n/a
2   4   2022    5   n/a
2   5   2022    15  n/a
2   6   2022    8   n/a
2   7   2022    2   n/a
2   8   2022    20  n/a
2   9   2022    10  n/a
2   10  2022    12  n/a
2   11  2022    1   n/a
2   12  2022    20  n/a
2   13  2022    2   n/a
2   14  2022    7   7
2   15  2022    21  14
2   16  2022    5   11
2   17  2022    10  10.75
2   18  2022    9   10.4
2   19  2022    4   9.333333333
2   20  2022    13  9.857142857
2   21  2022    15  10.5
2   22  2022    25  12.11111111
2   23  2022    22  13.1
2   24  2022    14  13.18181818
2   25  2022    2   12.25
2   26  2022    12  12.23076923
2   27  2022    15  12.42857143
2   28  2022    19  12.86666667

The rolling mean column might column is shown also. The final answer needs to look like this:

final df:

Month Year Last 14-day Rolling Mean
1     2022      13.86666667
2     2022      12.86666667

Thank you for any help.


Solution

  • This will give you the answer you seek by dynamically creating a 14 day prior column for each month then filtering the data where the day for each month is greater than the 14_Days_Prior column

    df['14_Days_Prior'] = df.groupby(['Month'])['Day'].transform('max') - 14
    df.loc[df['Day'] >= df['14_Days_Prior']].groupby(['Month', 'Year'], as_index=False)['Var1'].agg('mean')