Search code examples
pandasdataframemergelookup

Merge on one column if condition met else merge on another column


So consider these dfs:

data1 = [{'name': 'Muhammad', 'age_x': 20, 'city_x': 'Karachi', 'number' : 1001},
        {'name': 'Ali', 'age_x': 19, 'city_x': 'Lahore', 'number' : 1002},
        {'name': 'Ahmed', 'age_x': 25, 'city_x': 'Islamabad', 'number' : None}]

data2 = [{'name': 'Muhammad', 'age_x': 20, 'city_x': 'Karachi', 'number' : 1001},
        {'name': 'Ali', 'age_x': 19, 'city_x': 'Lahore', 'number' : 1002},
        {'name': 'Ahmed', 'age_x': 25, 'city_x': 'Islamabad', 'number' : None}]
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

Is there a good solution to merge these dfs (or indicate the matches in df1) based on number column if not None, if number column is None, then based on name?


Solution

  • Since you have different types in number/name, an easy option would be to create a custom key with fillna and use it to merge:

    out = pd.merge(df1, df2,
                   left_on=df1['number'].fillna(df1['name']).rename('key'),
                   right_on=df2['number'].fillna(df2['name']).rename('key'))
    

    Output:

        key_0    name_x  age_x_x   city_x_x  number_x    name_y  age_x_y   city_x_y  number_y
    0  1001.0  Muhammad       20    Karachi    1001.0  Muhammad       20    Karachi    1001.0
    1  1002.0       Ali       19     Lahore    1002.0       Ali       19     Lahore    1002.0
    2   Ahmed     Ahmed       25  Islamabad       NaN     Ahmed       25  Islamabad       NaN