python pandas dataframe comparison

Add an additional column to a pandas DataFrame comparing two columns


I have a dataframe (df) containing two columns:

Column 1     Column 2
Apple        Banana
Chicken      Chicken
Dragonfruit  Egg
Fish         Fish

What I want to do is create a third column that says whether the values in the two columns are the same on each row. For instance:

Column 1     Column 2  Same
Apple        Banana    No
Chicken      Chicken   Yes
Dragonfruit  Egg       No
Fish         Fish      Yes

I've tried: df['Same'] = df.apply(lambda row: row['Column A'] in row['Column B'],axis=1)

That didn't work. I also tried writing a for loop but couldn't get anywhere close to a working version.

Any help you can provide would be much appreciated!


Solution

  • You can simply use np.where, which returns 'Yes' where the two columns are equal and 'No' otherwise:

    import numpy as np
    
    df['Same'] = np.where(df['Column 1'] == df['Column 2'], 'Yes', 'No')
    

    >>> print(df)
          Column 1 Column 2 Same
    0        Apple   Banana   No
    1      Chicken  Chicken  Yes
    2  Dragonfruit      Egg   No
    3         Fish     Fish  Yes
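For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the columns really are named 'Column 1' and 'Column 2' as in the table, and it rebuilds the sample data by hand. It also includes a row-wise version for comparison with the attempt in the question. That attempt most likely failed for two reasons: 'Column A' and 'Column B' are not the column names shown (so the lookup would raise a KeyError), and `in` between two strings tests for a substring, not equality. The name `same_apply` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied by hand from the question.
df = pd.DataFrame({
    "Column 1": ["Apple", "Chicken", "Dragonfruit", "Fish"],
    "Column 2": ["Banana", "Chicken", "Egg", "Fish"],
})

# Vectorised element-wise comparison, mapped to 'Yes'/'No' labels.
df["Same"] = np.where(df["Column 1"] == df["Column 2"], "Yes", "No")

# Row-wise equivalent of the attempt in the question: compare with ==
# (not `in`, which is a substring test) and use the real column names.
same_apply = df.apply(
    lambda row: "Yes" if row["Column 1"] == row["Column 2"] else "No",
    axis=1,
)

print(df)
```

Both versions give the same labels. The np.where form is usually preferable because it compares whole columns at once, while apply with axis=1 calls the lambda once per row.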