python pandas dataframe merge concatenation

Concatenate two dataframes with different headers


How can I concatenate two dataframes that have different headers and a different number of columns? I want the headers of the second dataframe to be added as a row.

Below is my df1:

import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [3, 4]})

df1

   A  B  C
0  1  2  3
1  2  3  4

df2:

df2 = pd.DataFrame({'D': [34, 'e'], 'E': ['rt', 'rt']})

df2

    D   E
0  34  rt
1   e  rt

My expected output:

df = pd.DataFrame({'A': [1, 2, 'D', 34, 'e'], 'B': [2, 3, 'E', 'rt', 'rt'],
                   'C': [3, 4, '', '', '']})

df

    A   B  C
0   1   2  3
1   2   3  4
2   D   E   
3  34  rt   
4   e  rt   

My attempts thus far failed to achieve my expected output:

(1)

df = pd.concat([df1, df2], axis=0)  # also tried axis=1

(2)

df1_header = pd.DataFrame([df2.columns], columns=df1.columns)
df2.columns = df1.columns 
result = pd.concat([df1_header, df1, df2], ignore_index=True)

Solution

  • Here's one approach:

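    df = (
        pd.concat(
            [
                df1,
                # Turn df2's header into its first data row, then relabel the
                # resulting positional columns (0, 1) with df1's names (A, B).
                df2.T.reset_index().T.rename(
                    columns={k: v for k, v in iter(enumerate(df1.columns))}
                ),
            ],
            axis=0,
            ignore_index=True,
        )
    )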
    

    Output

        A   B    C
    0   1   2  3.0
    1   2   3  4.0
    2   D   E  NaN
    3  34  rt  NaN
    4   e  rt  NaN
    

    Explanation

    • Move the column names of df2 into its first row by chaining df.T, df.reset_index and df.T again. After the reset, df2 has the positional column names [0, 1] instead of ['D', 'E'].
    • Next, rename those positional columns to the names used by df1: enumerate df1.columns, build a dict that maps each position to a name ({0: 'A', 1: 'B', 2: 'C'}) with a dict comprehension, and pass it to df.rename. The iter call is optional, since enumerate already returns an iterator, and the unused key 2 is simply ignored.
    • Finally, use pd.concat on axis=0 with ignore_index=True so the rows are renumbered from 0.
    • Your expected output has empty strings instead of NaN values. If you need that, chain .fillna('') onto the result.
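
To make the individual steps easier to inspect, the same chain can be unrolled into intermediate variables. This is only a sketch of the approach above: the names df2_with_header, position_to_name, df2_renamed, stacked and result are illustrative and not part of the original answer, and dict(enumerate(...)) is used as a shorter equivalent of the dict comprehension.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [3, 4]})
df2 = pd.DataFrame({'D': [34, 'e'], 'E': ['rt', 'rt']})

# Step 1: move df2's header into the data.
# .T makes 'D' and 'E' the index, reset_index() turns that index into a
# regular column, and the second .T puts the old header in the first row.
# The columns are now the positions 0 and 1.
df2_with_header = df2.T.reset_index().T

# Step 2: relabel the positional columns with df1's names (0 -> 'A', 1 -> 'B').
# Keys with no matching column (here 2 -> 'C') are ignored by rename.
position_to_name = dict(enumerate(df1.columns))
df2_renamed = df2_with_header.rename(columns=position_to_name)

# Step 3: stack the frames and renumber the rows from 0.
# Column 'C' has no counterpart in df2, so its new rows are NaN.
stacked = pd.concat([df1, df2_renamed], axis=0, ignore_index=True)

# Step 4 (optional): replace NaN with empty strings.
result = stacked.fillna('')

print(result)
```

One caveat with the optional last step: by the time fillna('') runs, the numbers in column C have typically already been upcast to float because of the NaN values, so they may display as 3.0 and 4.0 rather than 3 and 4.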