python pandas dataframe data-munging

Pandas: aggregate each column using the 4 previous rows


I have a dataframe:

df = Feat1  Feat2   Feat3
     1        3       2
     4        7       1
     6        1       6
     2        9       4
     5        8       5
     0        3       1

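I want to create a dataframe such that, for each row t > 4 (counting from 1), each column's value is x[t-1] + 0.75*x[t-2] + 0.5*x[t-3] + 0.25*x[t-4], where x[t-k] is the value k rows above in the same column, using already-computed values where they exist.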

So for this df I will have:

df = Feat1  Feat2   Feat3
     1        3       2
     4        7       1
     6        1       6
     2        9       4
    8.75      14      9.5
   14.25      23      15.75

What is the best way to do so?


Solution

  • Recursive calculations are not vectorisable; to improve performance, use numba:

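    import pandas as pd
    from numba import jit
    
    # Compile the loop to machine code; it stays sequential because row t depends on rows t-1..t-4
    @jit(nopython=True)
    def f(d):
        # Rows 0-3 are kept as-is; each later row is built from the four rows before it
        for t in range(4, d.shape[0]):
            d[t] = d[t-1] + 0.75*d[t-2] + 0.5*d[t-3] + 0.25*d[t-4]
        return d
    
    # numba works on NumPy arrays, so pass a float array and wrap the result back into a DataFrame
    df = pd.DataFrame(f(df.to_numpy().astype(float)), index=df.index, columns=df.columns)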
    print (df)
       Feat1  Feat2  Feat3
    0   1.00    3.0   2.00
    1   4.00    7.0   1.00
    2   6.00    1.0   6.00
    3   2.00    9.0   4.00
    4   8.75   14.0   9.50
    5  14.25   23.0  15.75
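
If numba is not available, the same recurrence can be sketched as a plain NumPy loop. This is only an illustration, not part of the answer above: the helper name `weighted_recurrence` and the `WEIGHTS` constant are made up here, and the loop will be slower than the compiled version on large frames.

```python
import numpy as np
import pandas as pd

# Weights for rows t-1, t-2, t-3, t-4, taken from the question.
WEIGHTS = np.array([1.0, 0.75, 0.5, 0.25])


def weighted_recurrence(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rebuild every row from index 4 on using the four rows before it."""
    # Work on a float copy so the caller's DataFrame is left untouched.
    values = frame.to_numpy(dtype=float, copy=True)
    n_lags = len(WEIGHTS)
    for t in range(n_lags, values.shape[0]):
        # Reverse the slice so the rows are ordered t-1, t-2, t-3, t-4.
        previous = values[t - n_lags:t][::-1]
        # (4,) @ (4, n_columns) gives one weighted sum per column.
        values[t] = WEIGHTS @ previous
    return pd.DataFrame(values, index=frame.index, columns=frame.columns)


df = pd.DataFrame({
    "Feat1": [1, 4, 6, 2, 5, 0],
    "Feat2": [3, 7, 1, 9, 8, 3],
    "Feat3": [2, 1, 6, 4, 5, 1],
})
result = weighted_recurrence(df)
print(result)
```

The loop is still sequential because row 5 needs the freshly computed row 4; only the per-column arithmetic is done in one NumPy operation.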