Search code examples
pythonpandasdataframemulti-indexreindex

Pandas df.reindex() doesn't work for level>0 for MultiIndex


I've got two DataFrames A and B, both with a MultiIndex in the columns. B however has the order of the columns switched up. Now I would simply like to get the columns of B in the same order as in A.

The clue is, the order depends on the second level of the MultiIndex. Pandas' reindex() should do just that, but it does only work for me at level 0 as shown in the following

df = pd.DataFrame([[1,2,3],[4,5,6]])
df.columns = pd.MultiIndex.from_arrays([["a","b","c"], ["aa","bb","cc"]])
print(df)

outputs

    a   b   c
    aa  bb  cc
0   1   2   3
1   4   5   6

First level of the MultiIndex:

df = df.reindex(columns=["b","a","c"], level=0)
print(df)

    b   a   c
    bb  aa  cc
0   2   1   3
1   5   4   6

Second level:

df = df.reindex(columns=["bb","aa","cc"], level=1)
df
    a   b   c
    aa  bb  cc
0   1   2   3
1   4   5   6

The function doesn't raise an exception. And I've also tried it with columns=["bb","aa","cc"], level=2 or columns=["bb","aa","cc"], level=0. There it outputs an empty DataFrame, so my implementation seems to be correct, but just nothing happens...

The next best workaround I can think of is changing level 1 to level 0, but that shouldn't be necessary. Looking forward to your ideas!


Solution

  • If you want to reindex both levels, just pass your MultiIndex:

    A = pd.DataFrame([[1,2,3],[4,5,6]])
    A.columns = pd.MultiIndex.from_arrays([["a","b","c"], ["aa","bb","cc"]])
    
    B = pd.DataFrame(columns=pd.MultiIndex.from_arrays([["b","a","c"],
                                                        ["bb","aa","cc"]]),
                     index=[0])
    
    out = A.reindex(columns=B.columns)
    

    Or maybe you just want to sort_index:

    order = ["bb","aa","cc"]
    out = A.sort_index(level=1, axis=1,
                       key=pd.Series({v:k for k,v in enumerate(order)}).get)
    

    Output:

       b  a  c
      bb aa cc
    0  2  1  3
    1  5  4  6