
Remove rows with similar values


I want to remove every row where the value in column "a" equals the value in column "b" from a DataFrame like this:

    a   b
1 AAA BBB
2 AAA CCC
3 AAA AAA
4 CCC CCC
5 CCC BBB
6 CCC DDD

Desired output:

    a   b
1 AAA BBB
2 AAA CCC
3 CCC BBB
4 CCC DDD

Solution

    In [93]: df.loc[df.a.ne(df.b)]
    Out[93]:
         a    b
    1  AAA  BBB
    2  AAA  CCC
    5  CCC  BBB
    6  CCC  DDD

This keeps only the rows where the value in "a" is not equal to the value in "b". The surviving rows retain their original index labels (1, 2, 5, 6).
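
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question, applies the same `ne` filter, and then renumbers the index from 1 to match the desired output. The variable names `filtered` and `renumbered` are illustrative only, and the renumbering step is an assumption about what the asker wanted from the index.

```python
import pandas as pd

# Rebuild the example DataFrame from the question (index labels 1..6).
df = pd.DataFrame(
    {
        "a": ["AAA", "AAA", "AAA", "CCC", "CCC", "CCC"],
        "b": ["BBB", "CCC", "AAA", "CCC", "BBB", "DDD"],
    },
    index=range(1, 7),
)

# Keep only the rows where "a" differs from "b".
# df["a"].ne(df["b"]) is the element-wise equivalent of df["a"] != df["b"].
filtered = df.loc[df["a"].ne(df["b"])]

# Optional (assumed requirement): renumber the index 1..n,
# as shown in the question's desired output.
renumbered = filtered.reset_index(drop=True)
renumbered.index += 1

print(filtered)
print(renumbered)
```

`df[df["a"] != df["b"]]` is an equivalent spelling of the filter. Note that `NaN` never compares equal to itself, so a row with missing values in both columns would be kept by either form.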