Search code examples
pythonpandasmulti-index

Merge Select Columns Dataframe Columns Into a Multi-Index


I have N dataframes, in this case lets use 2 dfs as an example:

df1 = pd.DataFrame([['a', 2], ['b', 4]], columns=['foo', 'bar'])
df2 = pd.DataFrame([['a', 3], ['b', 5]], columns=['foo', 'bar'])

Which produce:

    foo bar
0   a   2
1   b   4
    foo bar
0   a   3
1   b   5

How can I concat or merge them into a multi-index, where the new column level's name is based on some external variable attached to the dfs, Eg I will use the df name as an example here:

        df1 df2
    foo bar bar
0   a   2   3
1   b   4   5

The dataframes are guaranteed to have the same foo values in the same order.


Solution

  • You can use a custom concat:

    dfs = [df1, df2]
    
    out = (pd.concat({f'df{i}': d.set_index('foo')
                      for i, d in enumerate(dfs, start=1)}, axis=1)
             .reset_index()
          )
    

    Output:

      foo df1 df2
          bar bar
    0   a   2   3
    1   b   4   5
    

    If really you need the "foo" at the bottom:

    out = (pd.concat({f'df{i}': d.set_index('foo')
                      for i, d in enumerate(dfs, start=1)}, axis=1)
             .swaplevel(axis=1)
             .reset_index().swaplevel(axis=1)
          )
    

    Output:

          df1 df2
      foo bar bar
    0   a   2   3
    1   b   4   5