Search code examples
pythonpandasdataframemergeconcatenation

Pandas - Merging Two Data frames with different index names but same amount of Columns


I'm trying to merge together two Pandas data frames but not able to get the desired result. One data frame has been transposed, and I think for this reason has a header row containing the column indexes. The other has a header row with column names, however each have the same number of columns.

Data Frame 1 (DF1):

+-----------------------------------------------------+
|      NAME      RD1     RD2     RD3     RD4     RD5  |
+-----------------------------------------------------+
| 0     Tom      4       2       1       -       3    |
| 1     Mark     2       2       3       2       1    |
| 2     James    -       1       5       5       4    |
| ...   ...      ...     ...     ...     ...     ...  |
| 30    Ted      2       3       3       3       2    |
+-----------------------------------------------------+

Data Frame 2 (DF2)

+--------------------------------------------------------------------+
|       0      1          2          3          4          5         |
+--------------------------------------------------------------------+
| 0     Desc   Round 1    Round 2    Round 3    Round 4    Round 5   |
| 1     Opp    Hawks      Dolphins   Raptors    Sharks     Bears     |
| 2     Date   02/04      08/04      16/04      24/04      30/04     |
| 3     Venue  MELB       PERTH      MELB       SYD        MELB      |
+--------------------------------------------------------------------+

I've tried using multiple pandas methods (join / append / merge / concat) to join the two data frames without adding any columns. However each method seems to add up both the columns and rows together. I want to be able to do an inner join but I can seem to find a way without having column index numbers. Is there a way I can add a top row to DF1 with the column index numbers the perform an inner join of the two DFs? I have also looked for ways to convert DF into headers instead with no luck as of yet.

The desired final DF would look like:

+--------------------------------------------------------------------+
|       Desc   Round 1    Round 2    Round 3    Round 4    Round 5   |
|       Opp    Hawks      Dolphins   Raptors    Sharks     Bears     |
|       Date   02/04      08/04      16/04      24/04      30/04     |
|       Venue  MELB       PERTH      MELB       SYD        MELB      |
+--------------------------------------------------------------------+
| 0     Tom    4          2          1          -          3         |
| 1     Mark   2          2          3          2          1         |
| 2     James  -          1          5          5          4         |
| ...   ...    ...        ...        ...        ...        ...       |
| 30    Ted    2          3          3          3          2         |
+--------------------------------------------------------------------+

Solution

  • You can use pd.MultiIndex.from_arrays:

    df1.columns = pd.MultiIndex.from_arrays(df2.values)
    print(df1)
    
    # Output
         Desc Round 1  Round 2 Round 3 Round 4 Round 5
          Opp   Hawks Dolphins Raptors  Sharks   Bears
         Date   02/04    08/04   16/04   24/04   30/04
        Venue    MELB    PERTH    MELB     SYD    MELB
    0     Tom       4        2       1       -       3
    1    Mark       2        2       3       2       1
    2   James       -        1       5       5       4
    30    Ted       2        3       3       3       2
    

    You can also use a comprehension to create a tuple of tuple usable as a MultiIndex:

    df1.columns = tuple(tuple(l) for l in df2.T.values)
    print(df1)
    
    # Output
         Desc Round 1  Round 2 Round 3 Round 4 Round 5
          Opp   Hawks Dolphins Raptors  Sharks   Bears
         Date   02/04    08/04   16/04   24/04   30/04
        Venue    MELB    PERTH    MELB     SYD    MELB
    0     Tom       4        2       1       -       3
    1    Mark       2        2       3       2       1
    2   James       -        1       5       5       4
    30    Ted       2        3       3       3       2