If I have a DataFrame and I'd like to insert a summary column at the beginning, I can run:
df.insert(0, 'Average', df.mean(axis='columns'))
Say I have a DataFrame with MultiIndex columns of the form:
import numpy as np
import pandas as pd

df = pd.DataFrame()
for l1 in ('a', 'b'):
    for l2 in ('one', 'two'):
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2'])
L1         a                   b
L2       one       two       one       two
0   0.585409  0.563870  0.535770  0.868020
1   0.404546  0.102884  0.254945  0.362751
2   0.475362  0.601632  0.476761  0.665126
3   0.926288  0.615655  0.257977  0.668778
4   0.509069  0.706685  0.355842  0.891862
How do I add the mean of all the one columns and all the two columns as the first two columns of this DataFrame, under the label 'Average'?
EDIT:
Expected output would be df.mean(level=1, axis=1), but inserted into the first two columns of the frame with the L1 label 'Average'. I was hoping the following would work:
df.insert(0, 'Average', df.mean(level=1, axis=1))
IIUC, you just need a groupby to calculate the mean, and then a bit of work on the columns of the resulting DataFrame:
s = df.groupby(level=1, axis=1).mean()
s.columns = pd.MultiIndex.from_product([['Average'], s.columns])
pd.concat([s, df], axis=1)
    Average                   a                   b
        one       two       one       two       one       two
0  0.517939  0.713116  0.531990  0.578338  0.503889  0.847894
1  0.571197  0.676809  0.698986  0.425227  0.443409  0.928391
2  0.689653  0.399053  0.843179  0.069174  0.536126  0.728931
3  0.288367  0.197891  0.026974  0.026774  0.549761  0.369009
4  0.449904  0.590919  0.372560  0.556332  0.527247  0.625506
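
On recent pandas versions, groupby(..., axis=1) is deprecated, so the same idea may need a small adjustment. Below is a minimal, self-contained sketch that transposes, groups on the L2 row level, and transposes back. The seeded generator and the names means and result are illustrative assumptions, not part of the original question.

```python
import numpy as np
import pandas as pd

# Example frame with two column levels, mirroring the question's layout.
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
columns = pd.MultiIndex.from_product(
    [['a', 'b'], ['one', 'two']], names=['L1', 'L2']
)
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Mean per L2 label: transpose so the column levels become row levels,
# group on L2, then transpose back to the original orientation.
means = df.T.groupby(level='L2').mean().T

# Give the means an outer 'Average' label so they line up with df's levels.
means.columns = pd.MultiIndex.from_product(
    [['Average'], means.columns], names=df.columns.names
)

# Put the averages first, followed by the original columns.
result = pd.concat([means, df], axis=1)
print(result)
```

Passing names=df.columns.names is optional; it is there so the L1 and L2 level names match on both sides of the concat.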