Search code examples
pythonpandasjoinmergemulti-index

Pandas merge with MultiIndex for repeated columns


I've got two DataFrames, containing the same information (length, width) about different aspects (left foot, right foot) of the same objects (people).

import pandas as pd

left_feet = pd.DataFrame(
    data={
        "Length": [20, 30, 25],
        "Width": [8, 10, 9]},
    index=[0, 1, 2])

right_feet = pd.DataFrame(
    data={
        "Length": [24, 30],
        "Width": [8, 10]},
    index=[2, 1])

print(left_feet)
   Length  Width
0      20      8
1      30     10
2      25      9

print(right_feet)
   Length  Width
2      24      8
1      30     10

I want to merge these into a single DataFrame, so I do this:

feet = pd.merge(left_feet, right_feet,
         left_index=True, right_index=True,
         suffixes=["_left", "_right"])

print(feet)
   Length_left  Width_left  Length_right  Width_right
1           30          10            30           10
2           25           9            24            8

Working with suffixes is cumbersome however. I would instead like the columns to be a MultiIndex, where the first level contains "left" and "right" and the second level contains "length" and "width".

What's the best way to do this?

Note: similar questions have been asked about concatenating aligned DataFrames, but this question is about a join (i.e. "merge") operation; the rows are not necessarily aligned and there may not always be corresponding rows.


Solution

  • Try concat, with keys parameter and join='inner':

    print(pd.concat([left_feet, right_feet], axis=1, keys=['Left','Right'], join='inner'))
        Left        Right      
      Length Width Length Width
    1     30    10     30    10
    2     25     9     24     8