Search code examples
pythonpandasmulti-index

How can I operate with MultiIndexed columns in pandas in specific levels?


In pandas with simple columns, you can operate with columns quite straightfowardly. For example, if you have a DataFrame df with numeric columns A and B, you can compute:

df['C']=df['A']+df['B']

Using MultiIndex columns, you can do the same, as long as you operate on all levels of the MultiIndex. For example, if the columns of your DataFrame are (A,X), (A,Y), (B,X), (B,Y), you can write

df[('C','X')]=df[('A','X')]+df[('B','X')]

and the resulting DataFrame will have an additional column, (C,X) which equals df[A,X]+df[B,X].

Which is the most straightforward way to do this on specific levels? More specifically, I'm looking for something like

df['C']=df['B']+df['A']

, which creates two columns (C,X) and (C,Y) which are equal to df[A,X]+df[B,X] and df[A,Y]+df[B,Y], but this is a ValueError.


Solution

  • We can reshape the dataframe to simplify the operations:

    df.stack().eval('C = A + B').unstack()
    

    Alternatively, you can use concat to append the extra level C:

    df.join(pd.concat({'C': df['A'] + df['B']}, axis=1))
    

       A     B      C    
       X  Y  X  Y   X   Y
    0  9  4  8  2  17   6
    1  3  7  0  8   3  15
    2  2  7  0  7   2  14
    3  3  8  4  8   7  16
    4  8  2  3  6  11   8