python pandas dataframe data-analysis

Comparing two DataFrame columns to check whether they have the same value in Python


I have two dataframes,

new1
      Name       city
 0    sri won    chn
 1    pechi won  pune
 2    Ram won    mum
 0    pec won    kerala

new3
    req
0   pec
1   mut

I tried,

mask=new1.Name.str.contains("|".join(new3.req.values.tolist()))
new1[mask]

I am getting,

 new1[mask]
      Name       city
 1  pechi won    pune
 0  pec won      kerala

Because "pechi" contains "pec", that row was matched as well, but I want an exact match on the values, not a "contains" match.

My desired output is,

 new1[mask]
      Name       city
 0  pec won      kerala

Solution

  • You need \b, which means "word boundary", so that "pec" matches only as a whole word and not inside "pechi":

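    # Build an alternation of the search terms wrapped in word boundaries
    a = r'\b(' + "|".join(new3.req.values.tolist()) + r')\b'
    print(a)
    \b(pec|mut)\b

    # Keep only the rows whose Name contains one of the terms as a whole word
    mask = new1.Name.str.contains(a)
    df = new1[mask]
    print(df)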
          Name    city
    0  pec won  kerala
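
    A self-contained variant of the same idea, as a minimal sketch. The sample
    frames are rebuilt from the question. Two things are my own assumptions
    rather than part of the answer above: the search terms are passed through
    re.escape in case they ever contain regex metacharacters, and the group is
    written as non-capturing, (?:...), because pandas may warn about match
    groups when a pattern passed to str.contains has a capturing group.

```python
import re

import pandas as pd

# Sample data reconstructed from the question (note the repeated index 0).
new1 = pd.DataFrame(
    {
        "Name": ["sri won", "pechi won", "Ram won", "pec won"],
        "city": ["chn", "pune", "mum", "kerala"],
    },
    index=[0, 1, 2, 0],
)
new3 = pd.DataFrame({"req": ["pec", "mut"]})

# Escape each term so that characters such as "." or "+" are matched literally,
# then wrap the alternation in word boundaries with a non-capturing group.
pattern = r"\b(?:" + "|".join(re.escape(term) for term in new3["req"]) + r")\b"

# True only where one of the terms appears as a whole word in Name.
mask = new1["Name"].str.contains(pattern, regex=True)
result = new1[mask]

print(pattern)
print(result)
```

    One caveat: \b only marks a transition between word characters (letters,
    digits, underscore) and anything else, so a term that itself begins or
    ends with punctuation would probably need a different boundary check.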