Search code examples
pythonpandasjoin

How to join two dataframes for a particular condition for the joining keys


I have two datframes df1 and df2 and I want to join them and create new dataframe df3 .

I want join work even if dest column of df1 has one match in dest in column of df2.

Join key is pair org,dest

df1

Name    org   dest  
Ashok   A      B
Rahul   A      C
Anupa   B      A
Sam     A      B

df2

org  dest   Amount
A       A/B/C   10
B       C       20
A       W       30

Expected Output

Name    org   dest  Amount
Ashok   A      B    10
Rahul   A      C    
Anupa   B      A
Sam     A      B    10

How can this be done in python


Solution

  • split and explode the ids in "dest", then merge:

    out = df1.merge(df2.assign(dest=df2['dest'].str.split('/'))
                       .explode('dest'),
                    on=['org', 'dest'], how='left')
    

    Output:

        Name org dest  Amount
    0  Ashok   A    B    10.0
    1  Rahul   A    C    10.0
    2  Anupa   B    A     NaN
    3    Sam   A    B    10.0
    

    Reproducible input:

    df1 = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
    df2 = pd.DataFrame({'C': [1, 2, 2, 3, 3, 3, 4],
                        'D': [9, 10, 11, 12, 13, 14, 15]})