Tags: python, pandas, numpy, magic-methods

Equality Comparison with NumPy Instance Invokes `__bool__`


I have defined a class whose __ge__ method returns an instance of itself, and whose __bool__ method must never be invoked (similar to a pandas Series).

Why is X.__bool__ invoked during np.int8(0) <= x, but not in any of the other examples? Who is invoking it? I have read the Data Model docs, but I haven't found my answer there.

import numpy as np
import pandas as pd

class X:
    def __bool__(self):
        print(f"{self}.__bool__")
        assert False
    def __ge__(self, other):
        print(f"{self}.__ge__")
        return X()

x = X()

np.int8(0) <= x

# Console output:
# <__main__.X object at 0x000001BAC70D5C70>.__ge__
# <__main__.X object at 0x000001BAC70D5D90>.__bool__
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "<stdin>", line 4, in __bool__
# AssertionError

0 <= x

# Console output:
# <__main__.X object at 0x000001BAC70D5C70>.__ge__
# <__main__.X object at 0x000001BAC70D5DF0>

x >= np.int8(0)

# Console output:
# <__main__.X object at 0x000001BAC70D5C70>.__ge__
# <__main__.X object at 0x000001BAC70D5D30>


pd_ge = pd.Series.__ge__
def ge_wrapper(self, other):
    print("pd.Series.__ge__")
    return pd_ge(self, other)

pd.Series.__ge__ = ge_wrapper

pd_bool = pd.Series.__bool__
def bool_wrapper(self):
    print("pd.Series.__bool__")
    return pd_bool(self)

pd.Series.__bool__ = bool_wrapper


np.int8(0) <= pd.Series([1,2,3])

# Console output:
# pd.Series.__ge__
# 0    True
# 1    True
# 2    True
# dtype: bool

Solution

  • TL;DR

    X.__array_priority__ = 1000


    The biggest hint is that it works with a pd.Series.

First I tried having X inherit from pd.Series. This worked (i.e. __bool__ was no longer called).

    To determine whether NumPy uses an isinstance check or a duck-typing approach, I removed the explicit inheritance and instead added (based on this answer):

    @property
    def __class__(self):
        return pd.Series
    

    The operation no longer worked (i.e. __bool__ was called).
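    That probe can be packaged as a self-contained snippet (the class name Y and the try/except harness are mine; it assumes NumPy and pandas are installed):

    ```python
    import numpy as np
    import pandas as pd

    class Y:
        # Spoof isinstance checks: report our class as pd.Series
        # without actually inheriting from it.
        @property
        def __class__(self):
            return pd.Series

        def __bool__(self):
            raise AssertionError(f"{self}.__bool__ was invoked")

        def __ge__(self, other):
            return Y()

    y = Y()

    # Python-level isinstance is fooled by the __class__ property...
    print(isinstance(y, pd.Series))  # True

    # ...but NumPy is not: in the session above, the comparison
    # still ended in __bool__ and the AssertionError.
    try:
        np.int8(0) <= y
    except AssertionError as exc:
        print("NumPy still truth-tested the result:", exc)
    ```

    So whatever NumPy is checking, it is not isinstance.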

    So now I think we can conclude that NumPy is duck-typing. Next, I checked which attributes are being accessed on X.

    I added the following to X:

    def __getattribute__(self, item):
        print("getattr", item)
        return object.__getattribute__(self, item)
    

    Again instantiating X as x, and invoking np.int8(0) <= x, we get:

    getattr __array_priority__
    getattr __array_priority__
    getattr __array_priority__
    getattr __array_struct__
    getattr __array_interface__
    getattr __array__
    getattr __array_prepare__
    <__main__.X object at 0x000002022AB5DBE0>.__ge__
    <__main__.X object at 0x000002021A73BE50>.__bool__
    getattr __array_struct__
    getattr __array_interface__
    getattr __array__
    Traceback (most recent call last):
      File "<stdin>", line 32, in <module>
        np.int8(0) <= x
      File "<stdin>", line 21, in __bool__
        assert False
    AssertionError
    

    Ah-ha! What is __array_priority__? Who cares, really; with a little digging, all we need to know is that NDFrame (from which pd.Series inherits) sets this value to 1000. (Roughly: it tells NumPy to defer to the higher-priority operand, so np.int8's comparison returns NotImplemented and Python falls back to X.__ge__ instead of coercing x to an array.)

    If we add X.__array_priority__ = 1000, it works! __bool__ is no longer called.
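    Putting it together, a minimal sketch of the fixed class (same X as in the question, plus the one class attribute; the mechanism comment reflects my understanding of NumPy's priority-deferral behavior):

    ```python
    import numpy as np

    class X:
        # pd.Series inherits this value (1000) from NDFrame. A high
        # priority makes NumPy defer to this operand, so
        # np.int8.__le__ returns NotImplemented and Python falls
        # back to X.__ge__.
        __array_priority__ = 1000

        def __bool__(self):
            raise AssertionError(f"{self}.__bool__ was invoked")

        def __ge__(self, other):
            return X()

    x = X()
    result = np.int8(0) <= x      # only X.__ge__ runs; no __bool__
    print(isinstance(result, X))  # True
    ```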

    What made this so difficult (I believe) is that the NumPy code doesn't show up in the call stack because it is written in C. I could investigate further by trying out the suggestion here.
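    As an aside: since NumPy 1.13 (NEP 13) there is a documented way to opt out of NumPy's operator machinery entirely, by setting __array_ufunc__ = None; NumPy's binary operators then return NotImplemented and Python falls back to the reflected method. A sketch (the class name Z is mine):

    ```python
    import numpy as np

    class Z:
        # Opt out of NumPy's ufunc machinery: NumPy's operators
        # return NotImplemented, so Python dispatches to Z.__ge__.
        __array_ufunc__ = None

        def __bool__(self):
            raise AssertionError(f"{self}.__bool__ was invoked")

        def __ge__(self, other):
            return Z()

    z = Z()
    result = np.int8(0) <= z      # dispatches straight to z.__ge__
    print(isinstance(result, Z))  # True
    ```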