I have a dataframe like this:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"ID":[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
"A":[30, 20, 10, 20, 60, 80, 90, 70, 120, 150, 120, 140]})
I would like to create a new column "B" holding the mean of df["A"] over each consecutive block of 4 rows, with that mean repeated on all 4 rows of the block. The result should look like this:
df
Out[6]:
ID A B
0 1 30 20.0
1 2 20 20.0
2 3 10 20.0
3 4 20 20.0
4 5 60 75.0
5 6 80 75.0
6 7 90 75.0
7 8 70 75.0
8 9 120 132.5
9 10 150 132.5
10 11 120 132.5
11 12 140 132.5
I tried something like df["B"] = df.rolling(window=4)['A'].mean(), but it didn't work as expected. Could anyone help me?
You can't use rolling here because its window slides forward one row at a time, whereas you need fixed, non-overlapping blocks of 4 rows.
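To see the difference, here is a small sketch of what the rolling attempt returns on the example data. The name rolled is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
                   "A": [30, 20, 10, 20, 60, 80, 90, 70, 120, 150, 120, 140]})

# Each value is the mean of the current row and the 3 rows before it,
# so the window overlaps from one row to the next.
rolled = df["A"].rolling(window=4).mean()

# The first 3 rows have no complete window and come out as NaN.
print(rolled.tolist())
```

Only the rows at positions 3, 7 and 11 happen to match the wanted block means (20.0, 75.0, 132.5); every other row mixes values from two neighbouring blocks.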
You need to use the floor division of a range as the grouper for groupby.transform('mean'):
import numpy as np
df['B'] = df.groupby(np.arange(len(df))//4)['A'].transform('mean')
Or use df.index//4 in place of np.arange(len(df))//4 if you already have a range index, as in your example.
Output:
ID A B
0 1 30 20.0
1 2 20 20.0
2 3 10 20.0
3 4 20 20.0
4 5 60 75.0
5 6 80 75.0
6 7 90 75.0
7 8 70 75.0
8 9 120 132.5
9 10 150 132.5
10 11 120 132.5
11 12 140 132.5
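As a further illustration, the sketch below shows the block labels that the floor division produces, using a frame that is not in the question: a hypothetical one with a non-range index and only 10 rows. In that case the positional np.arange form is the one to use, and the last block simply averages the 2 rows it has.

```python
import numpy as np
import pandas as pd

# Hypothetical data: string index, and a length that is not a multiple of 4.
df = pd.DataFrame({"A": [30, 20, 10, 20, 60, 80, 90, 70, 120, 150]},
                  index=list("abcdefghij"))

# Positional block labels: rows 0-3 get 0, rows 4-7 get 1, rows 8-9 get 2.
block = np.arange(len(df)) // 4
print(block)

# transform('mean') broadcasts each block's mean back onto its rows.
df["B"] = df.groupby(block)["A"].transform("mean")
print(df)
```

If partial blocks should be left empty instead of averaged, that would need an extra step, for example masking groups whose size is below 4.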