I have a DataFrame in which one of the int columns stores a binary flag pattern:
import pandas as pd
df = pd.DataFrame({'flag': [1, 2, 4, 5, 7, 3, 9, 11]})
I tried selecting the rows that have flag 4 set the way it is typically done (with the bitwise AND operator):
df[df['flag'] & 4]
But it failed with:
KeyError: "None of [Int64Index([0, 0, 4, 4, 4, 0, 0, 0], dtype='int64')] are in the [columns]"
How do I actually select the rows matching a binary pattern?
In pandas, & and | are commonly used as logical operators to combine conditions, but they are really bitwise operators: applied to an integer Series and an integer, they return a Series of integers, not a Series of Boolean values. df[...] then treats those integers as column labels, which is what raises the KeyError.
Knowing that, you can use either of the following approaches to select rows based on a binary pattern:
Since the result of <int> & <FLAG> is either <FLAG> or 0 (for a single-bit flag), you can use:
df[df['flag'] & 4 == 4]
which (due to the precedence of operators) evaluates as:
df[(df['flag'] & 4) == 4]
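To make the intermediate values visible, here is a minimal sketch using the DataFrame from the question. The variable names masked, is_set and selected are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'flag': [1, 2, 4, 5, 7, 3, 9, 11]})

# Bitwise AND on an integer Series yields integers, not booleans.
masked = df['flag'] & 4
print(masked.tolist())            # [0, 0, 4, 4, 4, 0, 0, 0]

# Comparing against the flag produces the Boolean mask that df[...] expects.
is_set = (df['flag'] & 4) == 4
print(is_set.tolist())            # [False, False, True, True, True, False, False, False]

# Boolean indexing keeps only the rows where the mask is True.
selected = df[is_set]
print(selected['flag'].tolist())  # [4, 5, 7]
```

Writing the parentheses explicitly is optional here, but it keeps the intent obvious to readers who are used to languages where == binds tighter than &.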
Alternatively, you can use apply and convert the result directly to a bool:
df[df['flag'].apply(lambda v: bool(v & 4))]
But this looks cumbersome and is likely to be much slower, because apply calls the lambda once per row in Python instead of using a vectorized operation.
In either case, the result is as expected:
   flag
2     4
3     5
4     7
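As a rough sketch, the two approaches can be checked against each other, together with a vectorized "not equal to zero" test that avoids apply. The names FLAG, MASK, by_compare, by_apply, by_nonzero, all_bits and any_bit are assumptions made for illustration, and the multi-bit mask 5 is a hypothetical value that is not part of the question.

```python
import pandas as pd

df = pd.DataFrame({'flag': [1, 2, 4, 5, 7, 3, 9, 11]})
FLAG = 4  # assumed name for the single bit being tested

by_compare = df[(df['flag'] & FLAG) == FLAG]               # result equals the flag
by_apply = df[df['flag'].apply(lambda v: bool(v & FLAG))]  # per-row bool conversion
by_nonzero = df[(df['flag'] & FLAG) != 0]                  # vectorized "any bit set"

# For a single-bit flag all three select the same rows.
print(by_compare.equals(by_apply) and by_compare.equals(by_nonzero))  # True

# With a multi-bit mask (hypothetical: bits 1 and 4) the tests differ.
MASK = 5
all_bits = df[(df['flag'] & MASK) == MASK]  # every bit of MASK must be set
any_bit = df[(df['flag'] & MASK) != 0]      # at least one bit of MASK is set
print(all_bits['flag'].tolist())  # [5, 7]
print(any_bit['flag'].tolist())   # [1, 4, 5, 7, 3, 9, 11]
```

So for a single flag such as 4 the choice is mostly about readability and speed, while for combined flags the == comparison means "all of these bits" and the truthiness test means "any of these bits".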