Create dataframe with hierarchical indices and extra columns from non-hierarchically indexed dataframe


Consider a simple dataframe:

import numpy as np
import pandas as pd

x = pd.DataFrame(np.arange(10).reshape(5,2))
print(x)
   0  1
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9

I would like to create a hierarchically indexed dataframe of the form:

     0         1
     a    b    a    b
0    0  NaN    1  NaN
1    2  NaN    3  NaN
2    4  NaN    5  NaN
3    6  NaN    7  NaN
4    8  NaN    9  NaN

where the 'a' columns correspond to the original dataframe columns and the 'b' columns are blank (or NaN). I can certainly create a hierarchically indexed dataframe filled with NaN and loop over the columns of the original dataframe, writing them into the new dataframe. Is there something more compact than that?


Solution

  • You can do this with MultiIndex.from_product:

    extra_level = ['a', 'b']
    new_cols = pd.MultiIndex.from_product([x.columns, extra_level])
    x.columns = new_cols[::len(extra_level)]  # every len(extra_level)-th entry, i.e. the 'a' slot of each original column
    x = x.reindex(columns=new_cols)
    print(x)
       0      1    
       a   b  a   b
    0  0 NaN  1 NaN
    1  2 NaN  3 NaN
    2  4 NaN  5 NaN
    3  6 NaN  7 NaN
    4  8 NaN  9 NaN
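
The same idea can be wrapped in a small function and checked on a frame where the number of columns differs from the number of extra labels. This is only a sketch: the helper name add_column_level and the 5x3 test frame are made up for illustration and are not part of the original answer. The slice step is the number of labels in the extra level, since from_product lists every label of the extra level for one original column before moving on to the next.

```python
import numpy as np
import pandas as pd


def add_column_level(df, extra_level):
    """Hypothetical helper: nest each column of df under the labels in
    extra_level, keeping the data under the first label and leaving the
    remaining labels as NaN."""
    # Full set of (original column, extra label) pairs, grouped by original column.
    new_cols = pd.MultiIndex.from_product([df.columns, extra_level])
    out = df.copy()  # leave the caller's frame untouched
    # Every len(extra_level)-th pair is the first-label slot of one original column.
    out.columns = new_cols[::len(extra_level)]
    # Reindexing against the full set adds the missing slots, filled with NaN.
    return out.reindex(columns=new_cols)


# Three columns but only two extra labels, so the two counts differ.
wide = pd.DataFrame(np.arange(15).reshape(5, 3))
result = add_column_level(wide, ['a', 'b'])
print(result)
```

Because the function works on a copy, the original frame keeps its flat columns and the call can be repeated with a different list of labels.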