Search code examples
pythonpandasjoinmergeconcatenation

How to stack two pandas dataframe and not loosing column names?


I have two very different pandas DataFrames with unique column names. And I want to stack them one under another.Nothing else like "union" or "intersection" needed. Like "copy paste" one under another. And I also don't want to lose any column names.

Df1:

  Col1
1  A    C
2  B    D
3  X    Y

DF2:

    Col2 Col3 Col4 Col5
   1  a   c    x    t
   2  b   d    y    j
   3  b   e    z    k

I tried using pd.concat([df1,df2], axis=0, ignore_index = True) but this resulted in something like this:

  Col1
1  A   C      Nan   Nan   Nan   Nan
2  B   D      Nan   Nan   Nan   Nan
3  X   Y      Nan   Nan   Nan   Nan
4  Nan Nan   Col2  Col3  Col4  Col5
5  Nan Nan    a     c      x     t
6  Nan Nan    b     b      y     j
7  Nan Nan    b     e      z     k

Expected Result:

  Col1
1  A     C     Nan    Nan
2  B     D     Nan    Nan
3  X     Y     Nan    Nan
4 Col2  Col3 Col4   Col5
5  a     c     x      t
6  b     b     y      j
7  b     e     z      k

Original DFs

DF1 enter image description here

DF2 enter image description here


Solution

  • Pandas does not allow for multiple unnamed columns (i.e. columns with the empty string as a name) since column names must be unique. That said, I think this gets close to your desired result.

    import pandas as pd
    import numpy as np
    
    df1 = pd.DataFrame({'Col1':[*'ABX'],'':[*'CDY']})
    df2 = pd.DataFrame({'Col2':[*'abb'],'Col3':[*'cde'],'Col4':[*'xyz'],'Col5':[*'tjk']})
    
    mat1 = np.vstack([df1.columns,df1.to_numpy()])
    mat2 = np.vstack([df2.columns,df2.to_numpy()])
    
    df = pd.concat([pd.DataFrame(mat1),pd.DataFrame(mat2)])
    

    The resulting dataframe df is

          0     1     2     3
    0  Col1         NaN   NaN
    1     A     C   NaN   NaN
    2     B     D   NaN   NaN
    3     X     Y   NaN   NaN
    0  Col2  Col3  Col4  Col5
    1     a     c     x     t
    2     b     d     y     j
    3     b     e     z     k