Search code examples
pandasdataframepivotstacktransform

Split Dataframe and stack horizontally


I have the following DataFrame:

import pandas as pd
data = {
   "cat": ['A'] * 3 + ['B'] * 2,   
   "val1": [10, 9, 12, 20, 25],
   "val2": [1, 3, 2, 6, 7],
   "idx": [0, 1, 5, 1, 2]
}

df = pd.DataFrame(data)
df.set_index('idx', inplace=True)

which gives

      cat     val1   val2
idx         
0       A       10      1
1       A        9      3
5       A       12      2
1       B       20      6
2       B       25      7

I want to transform it into a DataFrame with a column MultiIndex

        A              B
     val1   val2    val1    val2
idx         
0      10      1      NA      NA
1       9      3      20       6
2      NA     NA      25       7
5      12      2      NA      NA

Is there a way to do this without manually splitting and horizontally concatenating the table?


Solution

  • You can try with df.pivot, swaplevel, and sort_index. Add df.rename_axis to get rid of the added column name (i.e. 'cat').

    out = (
        df.pivot(columns='cat')
        .swaplevel(0, 1, axis=1)
        .sort_index(axis=1, level=0)
        .rename_axis(columns=(None, None))
    )
    
    out
    
            A          B     
         val1 val2  val1 val2
    idx                      
    0    10.0  1.0   NaN  NaN
    1     9.0  3.0  20.0  6.0
    2     NaN  NaN  25.0  7.0
    5    12.0  2.0   NaN  NaN
    

    If idx is not yet your index from the start, you can skip df.set_index('idx', inplace=True) and simply use: df.pivot(index='idx', columns='cat').