I am trying to calculate the Moving Average on the following dataframe but i have trouble joining the result back to the dataframe
The dataframe is : (Moving Average values are displayed in parentheses)
Key1 Key2 Value MovingAverage
1 2 1 (Nan)
1 7 2 (Nan)
1 8 3 (Nan)
2 5 1 (Nan)
2 3 2 (Nan)
2 2 3 (Nan)
3 7 1 (Nan)
3 5 2 (Nan)
3 8 3 (Nan)
4 7 1 (1.33)
4 2 2 (2)
4 9 3 (Nan)
5 8 1 (2.33)
5 3 2 (Nan)
5 9 3 (Nan)
6 2 1 (2)
6 7 2 (1.33)
6 9 3 (3)
The Code is :
import pandas as pd
d = {'Key1':[1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6], 'Key2':[2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],'Value':[1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)
print(df)
MaDf = df.groupby(['Key2'])['Value'].rolling(window=3).mean().to_frame('mean')
print (MaDf)
If you run the code it will correctly calculate the Moving Average based on 'Key2' and 'Value' but i can't find the way to correctly reinsert it back to the original dataframe (df)
Remove first level of MultiIndex
by Series.reset_index
with drop=True
for align by second level:
df['mean'] = (df.groupby('Key2')['Value']
.rolling(window=3)
.mean()
.reset_index(level=0, drop=True))
print (df)
Key1 Key2 Value mean
0 1 2 1 NaN
1 1 7 2 NaN
2 1 8 3 NaN
3 2 5 1 NaN
4 2 3 2 NaN
5 2 2 3 NaN
6 3 7 1 NaN
7 3 5 2 NaN
8 3 8 3 NaN
9 4 7 1 1.333333
10 4 2 2 2.000000
11 4 9 3 NaN
12 5 8 1 2.333333
13 5 3 2 NaN
14 5 9 3 NaN
15 6 2 1 2.000000
16 6 7 2 1.333333
17 6 9 3 3.000000
If default RangeIndex
is possible use Series.sort_index
:
df['mean'] = (df.groupby(['Key2'])['Value']
.rolling(window=3)
.mean()
.sort_index(level=1)
.values)
print (df)
Key1 Key2 Value mean
0 1 2 1 NaN
1 1 7 2 NaN
2 1 8 3 NaN
3 2 5 1 NaN
4 2 3 2 NaN
5 2 2 3 NaN
6 3 7 1 NaN
7 3 5 2 NaN
8 3 8 3 NaN
9 4 7 1 1.333333
10 4 2 2 2.000000
11 4 9 3 NaN
12 5 8 1 2.333333
13 5 3 2 NaN
14 5 9 3 NaN
15 6 2 1 2.000000
16 6 7 2 1.333333
17 6 9 3 3.000000