I can create a DataFrame and then give it a two-level column index as follows:
import pandas as pd
import numpy as np
idx = pd.MultiIndex.from_product([['bar', 'baz', 'foo', 'qux'], ['one', 'two', 'three']])
df = pd.DataFrame(np.random.randn(12, 6), index=idx, columns=['C', 'D', 'E', 'F', 'G', 'H'])
print(df, '\n')
df.columns = pd.MultiIndex.from_product([['A'], df.columns])
print(df.head(3))
I get this:
                  A
                  C         D         E         F         G         H
bar one   -0.370228  1.246188  0.673553  0.116890  0.129511  0.126562
    two   -1.059752 -0.357985 -0.189913  1.080814  0.588176  0.212053
    three -0.345277 -1.227097  0.915477  1.475285 -1.342885  0.149785
But what I want is this:
                  A                             B
                  C         D         E         F         G         H
bar one   -0.370228  1.246188  0.673553  0.116890  0.129511  0.126562
    two   -1.059752 -0.357985 -0.189913  1.080814  0.588176  0.212053
    three -0.345277 -1.227097  0.915477  1.475285 -1.342885  0.149785
So that my columns are accessed as (A, C), (A, D), (A, E), (B, F), (B, G), (B, H).
I have tried this (and a few other things):
df.columns = pd.MultiIndex.from_product([[['A'], df[['C', 'D', 'E']]], [['B'], df[['F', 'G', 'H']]]])
But I keep getting this error:
TypeError: unhashable type: 'list'
How can I create the multi-index in the way that I desire?
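You can simply assign a list of two lists as the DataFrame's columns, one list per level and each as long as the number of columns; pandas builds the MultiIndex from them:
df.columns = [
    ['A']*3 + ['B']*3,
    ['C', 'D', 'E', 'F', 'G', 'H']
]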
Result:
                  A                             B
                  C         D         E         F         G         H
bar one   -0.682664  0.221460 -0.595096 -0.327291  0.997758 -0.100819
    two    0.698674 -0.433741  1.572080  0.849064  0.666364 -0.313989
    three -0.597167  0.322044 -0.693410  1.551602  0.718136 -0.881298
baz one    0.701170 -0.164332  0.111689  0.192714 -0.743848 -1.447928
    two    0.504745 -0.059556 -1.673365  0.734746  0.586669  1.541826
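If the grouping is easier to state as "which columns go under which top-level label", the same index can probably also be built explicitly with pd.MultiIndex.from_tuples. The sketch below is only one way to do it; the groups dict and the variable names pairs, ac and b_block are made up for illustration and are not part of the question or the answer above.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([['bar', 'baz', 'foo', 'qux'], ['one', 'two', 'three']])
df = pd.DataFrame(np.random.randn(12, 6), index=idx, columns=['C', 'D', 'E', 'F', 'G', 'H'])

# Hypothetical mapping: top-level label -> existing columns that belong under it.
groups = {'A': ['C', 'D', 'E'], 'B': ['F', 'G', 'H']}

# Build (top, bottom) pairs in the order the columns should appear.
pairs = [(top, col) for top, cols in groups.items() for col in cols]

# Reorder the columns to match the pairs, then attach the two-level index.
df = df[[col for _, col in pairs]]
df.columns = pd.MultiIndex.from_tuples(pairs)

# Columns are then addressed by (top, bottom) tuples, or by top level alone.
ac = df[('A', 'C')]   # a single column, returned as a Series
b_block = df['B']     # the sub-frame holding columns F, G and H
```

Compared with assigning the list of lists directly, this costs a few more lines but keeps the label-to-column mapping in one place, which may help when the groups are not all the same size.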