I want to "flatten" an existing Dataframe and came across the Pandas melt()
command. This seems to be the weapon of choice here, but the behaviour is a bit unexpected (at least to me). Let's start with a fairly innocent MultiIndex DataFrame:
df = pd.DataFrame(np.random.randn(6, 6),
index=pd.MultiIndex.from_arrays([['X','X','X','Y','Y','Y'],
['x','y','z','x','y','z']],
names=['omega1', 'omega2']),
columns=pd.MultiIndex.from_arrays([['A','A','A','B','B','B'],
['a','b','c','a','b','c']],
names=['alpha1', 'alpha2']))
Gives a nice DataFrame like:
alpha1 A ... B
alpha2 a b ... b c
omega1 omega2 ...
X x 2.362954 0.015595 ... 1.273841 -0.632132
y -0.134122 1.791614 ... 1.101646 -0.181099
z 0.410267 1.063625 ... -1.483590 0.521431
Y x 0.001779 -0.076198 ... -1.395494 1.177853
y 0.453172 1.899883 ... 1.116654 -2.209697
z 1.636227 -0.999949 ... 0.800413 -0.431485
When I now do df.melt()
, I get something like this:
alpha1 alpha2 value
0 A a 2.362954
1 A a -0.134122
2 A a 0.410267
3 A a 0.001779
...
33 B c 1.177853
34 B c -2.209697
35 B c -0.431485
However I am more expecting something like this:
omega1 omega2 alpha1 alpha2 value
0 X x A a 2.362954
1 X y A a -0.134122
2 X z A a 0.410267
3 Y x A a 0.001779
...
33 Y x B c 1.177853
34 Y y B c -2.209697
35 Y z B c -0.431485
The exact order does not matter, but it would be nice if column and row names remained intact. I can't get Pandas to properly return the index with it. What am I doing wrong??
You need to reset_index
the index ,and pass ids in melt
with the index name
df.reset_index().melt(['omega1','omega2'])