
Pandas hybrid rolling mean


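I want to compute a rolling mean over a window of size 3 in which the last value comes from the b column and the other 2 values come from the preceding rows of the a column. Where fewer than 2 preceding rows exist, the mean is taken over the values that are available, as in this example: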

import pandas as pd
df = pd.DataFrame.from_dict({'a':[1, 2, 3, 4, 5], 'b':[10, 20, 30, 40, 50]})
df['c'] = [10, 10.5, 11, 15, 19] # how do I calculate this?

   a   b     c
0  1  10  10.0 # 10 / 1
1  2  20  10.5 # (20+1) / 2
2  3  30  11.0 # (30+2+1) / 3
3  4  40  15.0 # (40+3+2) / 3
4  5  50  19.0 # (50+4+3) / 3

Solution

  • Using a rolling calculation over the N-1 values of a that precede each row, combined with the current value of b:

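    N = 3

    # Window over the N-1 values of "a" before the current row;
    # closed="left" excludes the current row itself.
    roll = df["a"].rolling(N - 1, closed="left", min_periods=1)

    # Add the current "b" to the window sum and divide by the number of values used.
    # Row 0 has no preceding "a", so the result is NaN there and falls back to "b".
    df["c"] = (roll.sum().add(df["b"]) / roll.count().add(1)).combine_first(df["b"])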

    Output:

       a   b     c
    0  1  10  10.0
    1  2  20  10.5
    2  3  30  11.0
    3  4  40  15.0
    4  5  50  19.0
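
As a cross-check, the same numbers can be reached with a different technique: shifting a instead of rolling over it. The sketch below is self-contained and assumes neither column contains missing values. The helper name `hybrid_rolling_mean` and its parameter `n` are hypothetical, introduced only for illustration.

```python
import pandas as pd


def hybrid_rolling_mean(df: pd.DataFrame, n: int = 3) -> pd.Series:
    """Mean of the current "b" and up to n-1 preceding "a" values.

    Hypothetical helper; assumes "a" and "b" have no missing values.
    """
    # First column: the current b. Remaining columns: a shifted down by
    # 1..n-1 rows, so each row sees the a values from the rows before it.
    parts = [df["b"]] + [df["a"].shift(k) for k in range(1, n)]
    wide = pd.concat(parts, axis=1, keys=range(n))
    # mean(axis=1) skips NaN by default, so the first rows (which have
    # fewer than n-1 predecessors) are averaged over the values that exist.
    return wide.mean(axis=1)


df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})
df["c"] = hybrid_rolling_mean(df, n=3)
print(df)
```

This builds n temporary columns, so it is mainly practical for small windows. For larger N, the rolling version above avoids the intermediate frame.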