Search code examples
pythonpandascomparison

pandas match/compare multiple columns


I want to compare two pandas-tables by two columns. Consider following example: I would like to get a boolean Series which indicates True ONLY if BOTH conditions match. I tried is.in() without much success. I could either loop over "One" or combine (add) both columns together in both dataframes, but is there some built-in functionality of pandas that would allow for such (nested) comparisons involving two pairs of columns?

tab1 = pd.DataFrame({"One": [1, 1, 2, 3],
                     "Two": ["A", "B", "C", "C"]})

tab2 = pd.DataFrame({"One": [1, 2, 2, 3, 3],
                     "Two": ["A", "A", "B",  "A","C"]})

# Desired Result (New column in tab1): [True, False, False, True]

EDIT: The second table is more like a lookup table. They do not have the same shape. Should have made that clearer.


Solution

  • tab1.eq(tab2).all(1)
    0     True
    1    False
    2    False
    3     True
    dtype: bool
    

    Update

    tab1.merge(tab2,indicator=True,how='left')['_merge'].eq('both')
    0     True
    1    False
    2    False
    3     True
    Name: _merge, dtype: bool