python pandas series binary-operators

Pandas: error when checking for a binary flag pattern


I have a DataFrame in which one int column stores a binary flag pattern:

import pandas as pd

df = pd.DataFrame({'flag': [1, 2, 4, 5, 7, 3, 9, 11]})

I tried selecting the rows that have flag 4 set the way it is typically done (with the bitwise AND operator):

df[df['flag'] & 4]

But it failed with:

KeyError: "None of [Int64Index([0, 0, 4, 4, 4, 0, 0, 0], dtype='int64')] are in the [columns]"

How do I actually select the rows matching a binary pattern?


Solution

  • In pandas, & and | are commonly used to combine Boolean conditions, but they are still bitwise operators: applied to an integer Series and an integer, they return a Series of integers, not of Boolean values. Indexing a DataFrame with that integer Series makes pandas treat the values as column labels, hence the KeyError.

    Knowing that, you can use either of the following approaches to select rows based on a binary pattern:

    • Since the result of <int> & <FLAG> is either <FLAG> (bit set) or 0 (bit not set) for a single-bit <FLAG>, you can compare it with <FLAG>:

      df[df['flag'] & 4 == 4]
      

    which evaluates as follows, because in Python & binds more tightly than ==:

      df[(df['flag'] & 4) == 4]
    
    • Alternatively, you can use apply and map each result directly to a bool:

      df[df['flag'].apply(lambda v: bool(v & 4))]
      

    This is more cumbersome, though, and is likely to be much slower, since apply calls the lambda once per element instead of using a vectorized operation.

    In either case, the result is as expected:

       flag
    2     4
    3     5
    4     7
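
For reference, the approaches above can be put together in one runnable sketch. The names FLAG and MASK, the `!= 0` and `astype(bool)` variants, and the multi-bit example are illustrative additions rather than part of the original answer; for a single-bit flag they should all select the same rows.

```python
import pandas as pd

df = pd.DataFrame({'flag': [1, 2, 4, 5, 7, 3, 9, 11]})
FLAG = 4  # the single bit being tested, as in the question

# The bitwise AND keeps the integer dtype, so it is not a Boolean mask yet.
masked = df['flag'] & FLAG
print(masked.tolist())  # [0, 0, 4, 4, 4, 0, 0, 0]

# Three ways to turn it into a Boolean mask (equivalent for a single bit).
by_eq = df[(df['flag'] & FLAG) == FLAG]
by_ne = df[(df['flag'] & FLAG) != 0]
by_cast = df[(df['flag'] & FLAG).astype(bool)]
print(by_eq['flag'].tolist())  # [4, 5, 7]

# With a multi-bit mask (hypothetical value: bits 1 and 4) the comparisons differ.
MASK = 5
all_bits = df[(df['flag'] & MASK) == MASK]  # every bit of MASK is set
any_bit = df[(df['flag'] & MASK) != 0]      # at least one bit of MASK is set
print(all_bits['flag'].tolist())  # [5, 7]
print(any_bit['flag'].tolist())   # [1, 4, 5, 7, 3, 9, 11]
```

Comparing with `== MASK` asks for all bits of the mask, while `!= 0` (or `astype(bool)`) asks for any of them, so the choice only matters once the mask has more than one bit.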