How can I concatenate 2 dataframes with headers of different length? I want to add the headers of the second dataframe as a row.
Below is my df1:
df1 = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [3, 4]})
df1
   A  B  C
0  1  2  3
1  2  3  4
And my df2:
df2 = pd.DataFrame({'D': [34, 'e'], 'E': ['rt', 'rt']})
df2
    D   E
0  34  rt
1   e  rt
My expected output:
df = pd.DataFrame({'A': [1, 2,'D',34,'e'], 'B': [2, 3,'E','rt','rt'],
'C': [3, 4,'','','']})
df
    A   B  C
0   1   2  3
1   2   3  4
2   D   E
3  34  rt
4   e  rt
My attempts thus far failed to achieve my expected output:
(1)
df = pd.concat([df1, df2], axis=0)  # also tried axis=1
(2)
df1_header = pd.DataFrame([df2.columns], columns=df1.columns)
df2.columns = df1.columns
result = pd.concat([df1_header, df1, df2], ignore_index=True)
Here's one approach:
df = (
    pd.concat(
        [
            df1,
            df2.T.reset_index().T.rename(
                columns={k: v for k, v in iter(enumerate(df1.columns))}
            )
        ],
        axis=0,
        ignore_index=True
    )
)
Output
    A   B  C
0   1   2  3
1   2   3  4
2   D   E
3  34  rt
4   e  rt
Explanation
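First, move the column names of df2 to the first row with df.T, df.reset_index and again df.T. With the reset, df2 will now have column names [0, 1] instead of ['D', 'E'].
Next, rename these columns to match the first columns of df1: pass df.rename a dict built from the enumerated version of df1.columns (the iter wrapper around enumerate is optional).
Finally, use pd.concat on axis=0 and ignore the index.
Column C of the result holds NaN values in the rows that came from df2, where your expected output has empty strings. If you'd like to insist on that, add df.fillna('').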
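To make the steps easier to follow, here is the same idea split into intermediate variables. This is only a sketch: the names with_header, mapping, renamed, result and filled are made up for illustration, and it builds the rename mapping with zip instead of enumerate, which gives the same result for these two frames.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [3, 4]})
df2 = pd.DataFrame({'D': [34, 'e'], 'E': ['rt', 'rt']})

# Step 1: push df2's header into the data as a first row.
# After the transpose / reset_index / transpose round trip,
# the column labels are the integers 0 and 1.
with_header = df2.T.reset_index().T

# Step 2: map each positional label to the df1 column at that position.
# zip stops at the shorter input, so only 'A' and 'B' are used here.
mapping = dict(zip(with_header.columns, df1.columns))
renamed = with_header.rename(columns=mapping)

# Step 3: stack the frames and renumber the rows from 0.
result = pd.concat([df1, renamed], axis=0, ignore_index=True)

# Optional: replace the NaN values in column C with empty strings.
filled = result.fillna('')
print(filled)
```

Note that column C is upcast to float once the NaN values are introduced, so its first two values display as 3.0 and 4.0 even after fillna(''). Cast them back if you need integers.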