Search code examples
pythonpython-3.xpandasdataframemerge

Merge dataframe using common keys


I've been trying to merge two dataframes so that:

  • data with common keys will get updated => new column added with the additional value
  • data with no common keys will be added as new rows (see sample below)

I have been testing with both merge and concat but I still did not find the right config for them I guess...

Runnable example: https://onecompiler.com/python/3zvfatnj9

# df A
  host  val1  val2
0   aa    11    44
1   bb    22    55
2   cc    33    66

# df B
  host  val1  val3
0   aa    11    77
1   bb    22    88
2   dd     0    99

# df Expected after merge
  host  val1  val2  val3
0   aa    11  44.0   77.0
1   bb    22  55.0   88.0
2   cc    33  66.0   NaN
3   dd     0   NaN  99.0

Solution

  • Another possible solution, using pandas.DataFrame.join:

    cols = ['host', 'val1']
    dfa.set_index(cols).join(dfb.set_index(cols), how='outer').reset_index()
    

    Alternatively, using pandas.DataFrame.combine_first:

    cols = ['host', 'val1']
    dfa.set_index(cols).combine_first(dfb.set_index(cols)).reset_index()
    

    Output:

      host  val1  val2  val3
    0   aa    11  44.0  77.0
    1   bb    22  55.0  88.0
    2   cc    33  66.0   NaN
    3   dd     0   NaN  99.0