Tags: python, pandas, dataframe, multi-index

Reorder Multiindex Pandas Dataframe


I would like to reorder the columns of a DataFrame that has MultiIndex columns, while keeping each column's values attached to its label.

For example, this is the DataFrame I have:

import numpy as np
import pandas as pd

cols = [['Three', 'Two'], ['A', 'D', 'C', 'B']]
header = pd.MultiIndex.from_product(cols)
df = pd.DataFrame([[1, 4, 3, 2, 5, 8, 7, 6]] * 4, index=np.arange(1, 5), columns=header)
# Append two more columns under a new top-level label
df.loc[:, ('One', 'E')] = 9
df.loc[:, ('One', 'F')] = 10

>>> df

And I would like to change it as follows:

# Note: the `labels` argument was renamed to `codes` in pandas 0.24
header2 = pd.MultiIndex(levels=[['One', 'Two', 'Three'], ['E', 'F', 'A', 'B', 'C', 'D']],
                        codes=[[0, 0, 0, 0, 1, 1, 1, 1, 2, 2], [0, 1, 2, 3, 4, 5, 2, 3, 4, 5]])

df2 = pd.DataFrame([[9, 10, 1, 2, 3, 4, 5, 6, 7, 8]] * 4, index=np.arange(1, 5), columns=header2)
>>> df2

Solution

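  • First, define an ordered categorical for each column level, listing the labels in the order you want. Then call sort_index on the column axis (axis=1) with both levels, so the columns are sorted by category order rather than alphabetically.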

    # Desired order of the top-level labels
    v = pd.Categorical(df.columns.get_level_values(0),
                       categories=['One', 'Two', 'Three'],
                       ordered=True)
    # Desired order of the second-level labels
    v2 = pd.Categorical(df.columns.get_level_values(1),
                        categories=['E', 'F', 'C', 'B', 'A', 'D'],
                        ordered=True)
    df.columns = pd.MultiIndex.from_arrays([v, v2])

    # Sort the columns by both levels, following the category order
    df = df.sort_index(axis=1, level=[0, 1])

    df
      One     Two          Three         
        E   F   C  B  A  D     C  B  A  D
    1   9  10   7  6  5  8     3  2  1  4
    2   9  10   7  6  5  8     3  2  1  4
    3   9  10   7  6  5  8     3  2  1  4
    4   9  10   7  6  5  8     3  2  1  4
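
As an alternative to the categorical approach above, the following is a minimal sketch that ranks each column tuple against explicit per-level orderings and then reindexes. It swaps in a different technique (plain Python sorting plus DataFrame.reindex) and is not part of the original answer. The helper name reorder_columns and its parameters top_order and sub_order are hypothetical, and it assumes the columns have exactly two levels, are unique, and that every label appears in the corresponding ordering list.

```python
import numpy as np
import pandas as pd


def reorder_columns(df, top_order, sub_order):
    """Hypothetical helper: reorder two-level columns by explicit label orderings."""
    # Map each label to its position in the requested ordering
    top_rank = {label: i for i, label in enumerate(top_order)}
    sub_rank = {label: i for i, label in enumerate(sub_order)}
    # Iterating a MultiIndex yields one tuple per column, e.g. ('Two', 'A')
    ordered = sorted(df.columns, key=lambda c: (top_rank[c[0]], sub_rank[c[1]]))
    # reindex moves each column together with its values
    return df.reindex(columns=pd.MultiIndex.from_tuples(ordered, names=df.columns.names))


# Rebuild the DataFrame from the question
header = pd.MultiIndex.from_product([['Three', 'Two'], ['A', 'D', 'C', 'B']])
df = pd.DataFrame([[1, 4, 3, 2, 5, 8, 7, 6]] * 4, index=np.arange(1, 5), columns=header)
df[('One', 'E')] = 9
df[('One', 'F')] = 10

result = reorder_columns(df,
                         top_order=['One', 'Two', 'Three'],
                         sub_order=['E', 'F', 'A', 'B', 'C', 'D'])
print(result)
```

Because the ordering lives in two plain lists, the column labels keep their original dtype instead of becoming categorical, which may be convenient if the frame is concatenated or merged later.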