I have the following dataframe:
Column A Column B Column C
LONDON MADRID LONDON
LONDON MADRID LONDON
LONDON MADRID LONDON
LONDON MADRID MADRID
LONDON MADRID MADRID
LONDON MADRID MADRID
ROME HAMBURG ROME
ROME HAMBURG HAMBURG
ROME HAMBURG HAMBURG
ROME HAMBURG HAMBURG
LONDON MADRID LONDON
LONDON MADRID LONDON
LONDON MADRID MADRID
LONDON MADRID MADRID
I want to apply the logic: if the value in column A equals the value in the same row in column C AND the value in column B equals to the value in the next row in column C, return these two rows. The expected outcome is:
Column A Column B Column C
LONDON MADRID LONDON
LONDON MADRID MADRID
ROME HAMBURG ROME
ROME HAMBURG HAMBURG
LONDON MADRID LONDON
LONDON MADRID MADRID
It feels like something simple but I can't manage it. Any help is highly appreciated.
You can first test rows by condition with &
for bitwise AND and then chain shifted mask by |
for bitwise OR
in boolean indexing
:
m = df['Column A'].eq(df['Column C']) & df['Column B'].eq(df['Column C'].shift(-1))
df = df[ m | m.shift(fill_value=False)]
print (df)
Column A Column B Column C
2 LONDON MADRID LONDON
3 LONDON MADRID MADRID
6 ROME HAMBURG ROME
7 ROME HAMBURG HAMBURG
11 LONDON MADRID LONDON
12 LONDON MADRID MADRID