Search code examples
pythonpandasdatetimeaveragemoving-average

Continue column with average of previous values


I have a number of columns all indexed by datetime and I need to project a forecast for some of the columns based on the past n (lets say in this case 4) days.

          A  B  C
31-12-201917 19 49
1-1-2020  4  9  2
2-1-2020  2  23 3
3-1-2020  8  23 7
4-1-2020  6  21 4
5-1-2020  5
6-1-2020  5
7-1-2020  5
8-1-2020  5
9-1-2020  5
10-1-2020 5 
11-1-2020 5

So the desired output would have the average of the last 4 days as the respective value for that column for each of the future dates (as shown in column A). Note it should only take the last 4 days as opposed to all dates in the dataframe.

I've tried various rolling ave functions but all seem to require a new column!

Could anyone assist?


Solution

  • So by repeating 5 values in the column, I assume it is as simple as filling NaN with a constant value obtained from some averaging, right?

    So how about:

    for c in df.columns:
        df[c].fillna(
            df[df[c].notna()].tail(4)[c].mean(),
            inplace=True
        )
    
    • .fillna() simply fills NaN values
    • df[df[c].notna()].tail(4) gives you the last 4 non-nan days. feel free to change the value of .tail()
    • [c].mean() averages the values for column c