Join 2 dataframes with same column but different values (Python)

I want to join 2 tables, with same column names, table 1 has data universe with some NA's, table 2 has valid data for which table 1 has NA values.

I want to populate NA value in table 1 based on table 2 using joins.

Edit: Table 2 doesn't contain all columns as Table 1

I tried this

import pandas as pd

data1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8,]], columns=['A', 'B', 'C'])
data2 = pd.DataFrame([[7, 8, 9]], columns=['A','B','C'])

data1.merge(data2, on=['A','B'], how='left')

Output I expect:

Output I got:

Solution

If you're always "joining" based on known columns, then you probably want to concat and groupby.first (or last if you have different values and want to give priority to the new one):

out = pd.concat([data1, data2]).groupby(['A', 'B'], as_index=False).first()

Alternatively:

cols = ['A', 'B']

out = (data1.set_index(cols)
       .combine_first(data2.set_index(cols))
       .reset_index()
      )

Or, assuming you have range index in data1 and no duplicates for A/B in data2 (if not this will produce incorrect results), using a merge:

out = data1.combine_first(data1[cols].merge(data2, how='left'))

Output:

   A  B    C
0  1  2  3.0
1  4  5  6.0
2  7  8  9.0