I need to add a new feature that aggregates the last 5 values. When a 6th value is added, it should drop the first value and consider only the last 5, as shown below. Here is a dummy data frame; new_feature is the expected output.
id feature new_feature
1 a a
2 b a+b
3 c a+b+c
4 d a+b+c+d
5 e a+b+c+d+e
6 f b+c+d+e+f
7 g c+d+e+f+g
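Use Series.rolling with the min_periods=1 parameter and sum. With min_periods=1, the first rows, which have fewer than 5 values available, return a partial sum instead of NaN: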
import pandas as pd

df = pd.DataFrame({'feature':[1,2,4,5,6,2,3,4,5]})
# window of 5 rows; min_periods=1 allows partial sums for the first 4 rows
df['new_feature'] = df['feature'].rolling(5, min_periods=1).sum()
print(df)
feature new_feature
0 1 1.0
1 2 3.0
2 4 7.0
3 5 12.0
4 6 18.0
5 2 19.0
6 3 20.0
7 4 20.0
8 5 20.0
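To confirm that the window really drops the oldest value once a 6th row arrives, the rolling result can be compared against a manual slice-and-sum. This is only a minimal sketch: the 7-row frame and its numeric values are made up to mirror the layout in the question, and the names window, values and expected are illustrative, not part of the original answer.

```python
import pandas as pd

# Hypothetical data shaped like the question's example (ids 1..7)
df = pd.DataFrame({'id': range(1, 8),
                   'feature': [10, 20, 30, 40, 50, 60, 70]})
window = 5

# Same call as in the answer, with the window size pulled into a variable
df['new_feature'] = df['feature'].rolling(window, min_periods=1).sum()

# Manual check: sum the current row and up to window-1 rows before it
values = df['feature'].tolist()
expected = [sum(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))]

assert df['new_feature'].tolist() == expected
print(df)
```

Row 6 sums the values of ids 2 to 6 and row 7 sums ids 3 to 7, which matches the b+c+d+e+f and c+d+e+f+g pattern in the question. Note that rolling(...).sum() returns floats; if integers are needed, appending .astype(int) after the sum should work when min_periods=1 guarantees there are no NaN results.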