Search code examples
pythondataframemulti-index

Change dataframe from MultiIndex to MultiColumn


I have a dataframe like

multiindex1 = pd.MultiIndex.from_product([['a'], np.arange(3, 8)])
df1 = pd.DataFrame(np.random.randn(5, 3), index=multiindex1)
multiindex2 = pd.MultiIndex.from_product([['s'], np.arange(1, 6)])
df2 = pd.DataFrame(np.random.randn(5, 3), index=multiindex2)
multiindex3 = pd.MultiIndex.from_product([['d'], np.arange(2, 7)])
df3 = pd.DataFrame(np.random.randn(5, 3), index=multiindex3)
df = pd.concat([df1, df2, df3])
>>>
            0         1         2
a 3  0.872208 -0.145098 -0.519966
  4 -0.976089 -0.730808 -1.463151
  5 -0.026869  0.227912  1.525924
  6 -0.161126 -0.415913 -1.211737
  7 -0.893881 -0.769385  0.346436
s 1 -0.972483  0.202820  0.265304
  2  0.007303  0.802974 -0.254106
  3  1.619309 -1.545089  0.161493
  4  2.847376  0.951796 -0.877882
  5  1.749237 -0.327026  0.467784
d 2  1.440793 -0.697371  0.902004
  3  0.390181 -0.449725 -0.462104
  4  0.056916  0.140066  0.918281
  5  0.164234 -2.491176  2.035113
  6 -1.648948  0.372179  0.600297

Now I want to change it into multicolumns like

                              a                             b                             c
          0         1         2         0         1         2         0         1         2
1       Nan       Nan       Nan        ...
2       Nan       Nan       Nan        ...
3  0.872208 -0.145098 -0.519966        ...
4 -0.976089 -0.730808 -1.463151        ...
5 -0.026869  0.227912  1.525924        ...
6 -0.161126 -0.415913 -1.211737        ...
7 -0.893881 -0.769385  0.346436        ...

That's to say I want two goals to be done:

  1. change level0 index(multi index into single index) into level0 columns(single column into multi column)
  2. merge level1 index togather and reindex by it

I've tried stack unstack pivot, but couldn't get the target form. So is there any elegant way to achieve it?


Solution

  • This seemed to work for me. I used to work with someone who was amazing with multi-indices and he taught me a lot of this.

    df.unstack(level=0).reorder_levels([0,1], axis=1).swaplevel(axis=1)[["a", "s", "d"]]
    

    Output:

              a                             s                             d                                                                                                     
              0         1         2         0         1         2         0         1         2                                                                                 
    1       NaN       NaN       NaN  0.206957  1.329804 -0.037481       NaN       NaN       NaN                                                                                 
    2       NaN       NaN       NaN  0.244912  1.880180  1.447138 -1.009454  0.215560  0.126756                                                                                 
    3  0.871496 -1.247274 -0.458660  0.514475  0.989567 -1.653646 -0.623382 -0.799157  0.119259                                                                                 
    4 -0.756771  0.523621  0.067967  1.066499 -1.436044 -1.045745  0.440954 -1.997325 -1.223662                                                                                 
    5  0.707063  1.019831  0.422577  0.964631 -0.034742 -0.891528  0.891096 -0.724778 -0.043314                                                                                 
    6 -0.140548 -0.093853 -1.060963       NaN       NaN       NaN  0.643902 -0.062267 -0.505484                                                                                 
    7 -0.449435  0.360956 -0.769459       NaN       NaN       NaN       NaN       NaN       NaN