Search code examples
pandasjoinleft-join

Left Join and Anti-Join on same data frames Pandas


I have 2 dataframes like these:

df1 = pd.DataFrame(data = {'col1' : ['finance',  'accounting'], 'col2' : ['f1', 'a1']}) 
df2 = pd.DataFrame(data = {'col1' : ['finance', 'finance', 'finance', 'accounting', 'accounting','IT','IT'], 'col2' : ['f1','f2','f3','a1,'a2','I1','I2']})

df1

    col1        col2
0   finance      f1
1   accounting   a1

df2

    col1        col2
0   finance      f1
1   finance      f2
2   finance      f3
3   accounting   a1 
4   accounting   a2
5   IT           I1
6   IT           I2

I would like to do LEFT JOIN on col1 and ANTI-JOIN on col2. The output should look like this:

    col1      col2
   finance     f2
   finance     f3
   accounting  a2 

Could someone please help me how to do it properly in pandas. I tried both join and merge in pandas but it hasn't worked for me. Thanks in advance.


Solution

  • You can merge and filter:

    (df1.merge(df2, on='col1', suffixes=('_', None))
       .loc[lambda d: d['col2'] != d.pop('col2_')]
     )
    

    Output:

             col1 col2
    1     finance   f2
    2     finance   f3
    4  accounting   a2