Tags: pandas, filtering, conditional-operator

What is the correct way to use df.query() with multiple conditions?


Based on the pandas documentation for query, I do not understand whether multiple conditions in a query string should be combined with the boolean operators (and, or) or with the bitwise operators (&, |).

Is there a situation when using both bitwise and boolean operators might be necessary? Is there a best practice for when to use which?

The documentation states:

"For example, the & and | (bitwise) operators have the precedence of their boolean cousins, and and or."

The practical implications of this are not clear to me.

I have found accepted answers that use either bitwise or boolean operators. In one such case, the answer using query uses and in its first example and & in its second, and both return the same result.

Here is an example where the choice of operator does not change the result:

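import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.random.randn(5, 2), columns=['A', 'B'])
df['names'] = list('ABCDE')

# Conditions combined with the boolean operators and / or
query1 = df.query("A > -1 and B < 1 or 'B' in names")

# The same conditions combined with the bitwise operators & / |
query2 = df.query("A > -1 & B < 1 | 'B' in names")

query1.equals(query2)  # True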

Thanks for help.


Solution

  • Inside query(), every comparison returns a boolean (True/False) mask, so it does not matter whether you combine the comparisons with bitwise or logical operators. In plain Python, however, the two kinds of operator give different results when the operands are not 0/1 (False/True):

    >>> 0 & 1  # same as False & True
    0  # False
    
    >>> 0 and 1  # same as False and True
    0  # False
    
    >>> 2 & 3
    2  # bitwise AND of 0b10 and 0b11
    
    >>> 2 and 3
    3  # and returns its last operand when both are truthy
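
As a rough check of the point above, here is a minimal sketch using a small hand-made DataFrame (the values and the column C are illustrative assumptions, not taken from the question). It compares both spellings inside query(), then builds the same mask with ordinary indexing, where each comparison needs its own parentheses because & binds more tightly than > and < in plain Python.

```python
import pandas as pd

# Small deterministic frame; values and column C are illustrative only.
df = pd.DataFrame({
    "A": [-2.0, 0.5, 0.0, 2.0, -1.5],
    "B": [0.0, 2.0, 0.5, -3.0, 5.0],
    "C": [0, 1, 0, 0, 0],
})

# Inside query(), & and | take the precedence of and / or,
# so both strings describe the same selection.
with_words = df.query("A > -1 and B < 1 or C == 1")
with_symbols = df.query("A > -1 & B < 1 | C == 1")
same_in_query = with_words.equals(with_symbols)

# Outside query(), wrap every comparison in parentheses before
# combining the resulting masks with & and |.
mask = ((df["A"] > -1) & (df["B"] < 1)) | (df["C"] == 1)
same_as_mask = with_words.equals(df[mask])

print(same_in_query, same_as_mask)
print(list(with_words.index))
```

With these values, all three selections should pick the rows with index 1, 2 and 3.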