Search code examples
pythonpandasdataframeunpivot

Missing index when unpivotting / melt() Pandas DataFrame with MultiIndex columns and rows


I want to "flatten" an existing Dataframe and came across the Pandas melt() command. This seems to be the weapon of choice here, but the behaviour is a bit unexpected (at least to me). Let's start with a fairly innocent MultiIndex DataFrame:

df = pd.DataFrame(np.random.randn(6, 6),
                  index=pd.MultiIndex.from_arrays([['X','X','X','Y','Y','Y'],
                                                   ['x','y','z','x','y','z']], 
                                                  names=['omega1', 'omega2']),
                  columns=pd.MultiIndex.from_arrays([['A','A','A','B','B','B'],
                                                     ['a','b','c','a','b','c']], 
                                                   names=['alpha1', 'alpha2']))

Gives a nice DataFrame like:

alpha1                A              ...            B          
alpha2                a         b    ...            b         c
omega1 omega2                        ...                       
X      x       2.362954  0.015595    ...     1.273841 -0.632132
       y      -0.134122  1.791614    ...     1.101646 -0.181099
       z       0.410267  1.063625    ...    -1.483590  0.521431
Y      x       0.001779 -0.076198    ...    -1.395494  1.177853
       y       0.453172  1.899883    ...     1.116654 -2.209697
       z       1.636227 -0.999949    ...     0.800413 -0.431485

When I now do df.melt(), I get something like this:

   alpha1 alpha2     value
0       A      a  2.362954
1       A      a -0.134122
2       A      a  0.410267
3       A      a  0.001779
...
33      B      c  1.177853
34      B      c -2.209697
35      B      c -0.431485

However I am more expecting something like this:

  omega1 omega2 alpha1 alpha2     value
0      X      x      A      a  2.362954
1      X      y      A      a -0.134122
2      X      z      A      a  0.410267
3      Y      x      A      a  0.001779
...
33     Y      x      B      c  1.177853
34     Y      y      B      c -2.209697
35     Y      z      B      c -0.431485

The exact order does not matter, but it would be nice if column and row names remained intact. I can't get Pandas to properly return the index with it. What am I doing wrong??


Solution

  • You need to reset_index the index ,and pass ids in melt with the index name

    df.reset_index().melt(['omega1','omega2'])