python, python-3.x, pandas

pandas rolling mean and stacking previous values


I have a pandas DataFrame of shape (2000, 1). I would like to compute means over windows of the data while also keeping the previous values as lagged variables.

Assuming the Series:

1
2
3
4
5
6
7
8
9
10

with a rolling window of 3, I would like:

1,2,3,mean(4,5,6)
4,5,6,mean(7,8,9)

I am able to use the rolling function:

train_ds = train_ds.var1.rolling(3).mean()

but this does not produce the structure above, because I cannot find a way to stack the previous values next to each mean.


Solution

  • I'm not sure about your expected outcome, but you could reshape the series into a dataframe with three columns, then store the mean of the following row in a fourth column:

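    import pandas as pd
    s = pd.Series(range(1, 11))  # example data from the question
    n = 3
    # Trim the remainder so the length is a multiple of n, then reshape into n columns
    df = pd.DataFrame(s.to_numpy()[: len(s) - len(s) % n].reshape(-1, n))
    df["mean"] = df.agg("mean", axis=1).shift(-1)  # each row gets the mean of the next row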
    
       0  1  2  mean
    0  1  2  3   5.0
    1  4  5  6   8.0
    2  7  8  9   NaN
    

    Or if you want a series of strings as the outcome:

    s = df.astype(str).agg(", ".join, axis=1)
    
    0    1, 2, 3, 5.0
    1    4, 5, 6, 8.0
    2    7, 8, 9, nan
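
For reuse, the reshape-and-shift approach above could be wrapped in a small function. This is only a sketch: `lagged_blocks` is a hypothetical helper name (not part of pandas), and it assumes that the incomplete trailing block should be dropped, as in the answer. The last lines show how the same means might be read off `rolling(n).mean()` by taking the value at the end of each block.

```python
import pandas as pd


def lagged_blocks(s: pd.Series, n: int) -> pd.DataFrame:
    """Reshape s into rows of n values and append the mean of the next row.

    Hypothetical helper for illustration; assumes a trailing remainder
    shorter than n should be discarded.
    """
    usable = len(s) - len(s) % n  # drop the incomplete trailing block
    blocks = pd.DataFrame(s.to_numpy()[:usable].reshape(-1, n))
    # Row means are computed before the "mean" column exists, then shifted
    # up one row so each row carries the mean of the following block.
    blocks["mean"] = blocks.mean(axis=1).shift(-1)
    return blocks


s = pd.Series(range(1, 11))
out = lagged_blocks(s, 3)
print(out)

# The rolling mean at positions n-1, 2n-1, ... is the mean of each complete
# block, so shifting those values up by one gives the same column.
rolled = s.rolling(3).mean().iloc[2::3].reset_index(drop=True).shift(-1)
print(rolled)
```

The last row has no following block, so its mean is NaN in both versions; drop it with `out.dropna()` if only complete rows are wanted.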