I have data like the following, stored in a pandas DataFrame:
A B C D E F G
1 1 2 3 1 4 2
1 1 2 4 5 6 7
1 1 2 3 2 3 2
1 1 2 4 5 6 7
2 1 2 3 2 3 4
2 1 2 3 4 3 3
2 1 2 4 5 6 7
Here agg_lvl=['A','B','C'].
I want to calculate the mean, median, or sum of column G using only the last two records (tail(2)) of each group when the data is grouped by agg_lvl.
My expected output for the mean is:
A B C G
1 1 2 4.5
2 1 2 5
The output has the same layout for median and sum, with the median or sum values in place of the mean.
I tried the following code, but it did not give the expected output:
df.groupby(agg_lvl,as_index=False).tail(2).agg({'G':'mean'})
Can anyone help me with this issue?
Thanks in advance.
Use GroupBy.transform instead of agg to return a new column with the same shape as the DataFrame filtered by tail:
agg_lvl=['A','B','C']
df = df.groupby(agg_lvl,as_index=False).tail(2)
df['G'] = df.groupby(agg_lvl)['G'].transform('mean')
print (df)
   A  B  C  D  E  F    G
2  1  1  2  3  2  3  4.5
3  1  1  2  4  5  6  4.5
5  2  1  2  3  4  3  5.0
6  2  1  2  4  5  6  5.0
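As a side note on why the attempt in the question does not work: tail returns an ordinary DataFrame, not a grouped object, so a following agg runs over all of the remaining rows at once. A minimal sketch, assuming the sample data from the question is rebuilt by hand (the variable names last_two and overall are only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2],
    'B': [1, 1, 1, 1, 1, 1, 1],
    'C': [2, 2, 2, 2, 2, 2, 2],
    'D': [3, 4, 3, 4, 3, 3, 4],
    'E': [1, 5, 2, 5, 2, 4, 5],
    'F': [4, 6, 3, 6, 3, 3, 6],
    'G': [2, 7, 2, 7, 4, 3, 7],
})
agg_lvl = ['A', 'B', 'C']

# tail(2) keeps the last two rows of each group, but the result is a
# plain DataFrame: the grouping information is gone at this point.
last_two = df.groupby(agg_lvl).tail(2)

# agg on a plain DataFrame reduces the whole G column to one value,
# here the mean of 2, 7, 3 and 7, instead of one value per group.
overall = last_two.agg({'G': 'mean'})
print(overall)
```

This is why the filtered rows have to be grouped a second time before computing the statistic.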
EDIT: To get one row per group, as in the expected output, group the filtered rows again and aggregate:
df = df.groupby(agg_lvl,as_index=False).tail(2).groupby(agg_lvl,as_index=False)['G'].mean()
print (df)
   A  B  C    G
0  1  1  2  4.5
1  2  1  2  5.0
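Since the question also asks for the median and the sum, one possible way to get all three at once is to pass a list of function names to agg. This is only a sketch built on the sample data from the question; the name stats is illustrative, and reset_index is used here instead of as_index=False so that the group keys come back as regular columns:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2],
    'B': [1, 1, 1, 1, 1, 1, 1],
    'C': [2, 2, 2, 2, 2, 2, 2],
    'D': [3, 4, 3, 4, 3, 3, 4],
    'E': [1, 5, 2, 5, 2, 4, 5],
    'F': [4, 6, 3, 6, 3, 3, 6],
    'G': [2, 7, 2, 7, 4, 3, 7],
})
agg_lvl = ['A', 'B', 'C']

stats = (
    df.groupby(agg_lvl).tail(2)          # last two rows of each group
      .groupby(agg_lvl)['G']             # regroup the filtered rows
      .agg(['mean', 'median', 'sum'])    # one column per statistic
      .reset_index()                     # turn A, B, C back into columns
)
print(stats)
```

The result has one row per group, with the columns A, B, C, mean, median and sum.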