Search code examples
pandasif-statementiterationcomparison

Python: How to compare values in columns in the same row and in the next row?


I have the following dataframe:

Column A    Column B    Column C
LONDON      MADRID      LONDON
LONDON      MADRID      LONDON
LONDON      MADRID      LONDON
LONDON      MADRID      MADRID
LONDON      MADRID      MADRID
LONDON      MADRID      MADRID
ROME        HAMBURG     ROME
ROME        HAMBURG     HAMBURG
ROME        HAMBURG     HAMBURG
ROME        HAMBURG     HAMBURG
LONDON      MADRID      LONDON
LONDON      MADRID      LONDON
LONDON      MADRID      MADRID
LONDON      MADRID      MADRID

I want to apply the logic: if the value in column A equals the value in the same row in column C AND the value in column B equals to the value in the next row in column C, return these two rows. The expected outcome is:

Column A    Column B    Column C
LONDON      MADRID      LONDON
LONDON      MADRID      MADRID
ROME        HAMBURG     ROME
ROME        HAMBURG     HAMBURG
LONDON      MADRID      LONDON
LONDON      MADRID      MADRID

It feels like something simple but I can't manage it. Any help is highly appreciated.


Solution

  • You can first test rows by condition with & for bitwise AND and then chain shifted mask by | for bitwise OR in boolean indexing:

    m = df['Column A'].eq(df['Column C']) & df['Column B'].eq(df['Column C'].shift(-1))
    
    df = df[ m | m.shift(fill_value=False)]
    print (df)
       Column A Column B Column C
    2    LONDON   MADRID   LONDON
    3    LONDON   MADRID   MADRID
    6      ROME  HAMBURG     ROME
    7      ROME  HAMBURG  HAMBURG
    11   LONDON   MADRID   LONDON
    12   LONDON   MADRID   MADRID