In pandas with simple columns, you can operate with columns quite straightfowardly. For example, if you have a DataFrame df with numeric columns A and B, you can compute:
df['C']=df['A']+df['B']
Using MultiIndex columns, you can do the same, as long as you operate on all levels of the MultiIndex. For example, if the columns of your DataFrame are (A,X), (A,Y), (B,X), (B,Y), you can write
df[('C','X')]=df[('A','X')]+df[('B','X')]
and the resulting DataFrame will have an additional column, (C,X) which equals df[A,X]+df[B,X]
.
Which is the most straightforward way to do this on specific levels? More specifically, I'm looking for something like
df['C']=df['B']+df['A']
, which creates two columns (C,X) and (C,Y) which are equal to df[A,X]+df[B,X]
and df[A,Y]+df[B,Y]
, but this is a ValueError.
We can reshape
the dataframe to simplify the operations:
df.stack().eval('C = A + B').unstack()
Alternatively, you can use concat
to append the extra level C
:
df.join(pd.concat({'C': df['A'] + df['B']}, axis=1))
A B C
X Y X Y X Y
0 9 4 8 2 17 6
1 3 7 0 8 3 15
2 2 7 0 7 2 14
3 3 8 4 8 7 16
4 8 2 3 6 11 8