Search code examples
pythonpandasmerge

Merge two dataframes with only one column


I would like to merge two dataframes which have only one column say df1 and df2 as below. Expected output dataframe as df3.

import pandas as pd

data1 = [
    'A',
    'B',
    'C']
df1 = pd.DataFrame(data1,columns=['name'])
data2 = [
    'B',
    'C',
    'D']

df2 = pd.DataFrame(data2,columns=['name'])

data3 = [
    ['A',],
    ['B','B'],
    ['C','C'],
    [None,'D']]

df3 = pd.DataFrame(data3,columns=['name_x','name_y'])

print(df1)
print(df2)
print(df3)

Output:

  name
0    A
1    B
2    C
  name
0    B
1    C
2    D
  name_x name_y
0      A   None
1      B      B
2      C      C
3   None      D

Should I use merge to do it or any other way?


Solution

  • Quick hack, you could merge passing one of the keys as Series:

    df1.merge(df2, left_on='name', right_on=df2['name'], how='outer').drop(columns='name')
    

    Output:

      name_x name_y
    0      A    NaN
    1      B      B
    2      C      C
    3    NaN      D
    

    If you don't drop you'll also get a column with the merged values:

    df1.merge(df2, left_on='name', right_on=df2['name'], how='outer')
    
      name name_x name_y
    0    A      A    NaN
    1    B      B      B
    2    C      C      C
    3    D    NaN      D
    

    If you had extra columns in the input, this would give you:

    df1.merge(df2, left_on='name', right_on=df2['name'], how='outer')
    
      name name_x col1 name_y col2
    0    A      A    x    NaN  NaN
    1    B      B    x      B    x
    2    C      C    x      C    x
    3    D    NaN  NaN      D    x
    

    generic method

    If you have more than two inputs to combine, you can generalize with concat:

    dfs = [df1, df2, df1, df2]
    out = pd.concat(
        [
            d.set_axis(d['name']).add_suffix(f'_{i}')
            for i, d in enumerate(dfs, start=1)
        ],
        axis=1,
    ).reset_index(drop=True)
    

    Output:

      name_1 name_2 name_3 name_4
    0      A    NaN      A    NaN
    1      B      B      B      B
    2      C      C      C      C
    3    NaN      D    NaN      D