python pandas numpy boolean truthiness

Check Truthiness of Values in Pandas Series


In pandas, is it possible to construct a boolean Series for indexing from custom objects, based on their truthiness?

For example:

import pandas as pd

class Test():
    def __init__(self, num):
        self.num = num
    def __bool__(self):
        return self.num == 3

x = Test(2)
y = Test(3)

df = pd.DataFrame({'A':[x,y]})

print(bool(df['A'].iloc[1]))
print(df.where(df['A'] == True))

This prints:

True
      A
0   NaN
1   NaN

What I'd like instead is something like:

True
        A
0   False
1    True

or something similar, so that I can use .first_valid_index() to grab the first occurrence in a different function.

Is there any way to check the "Truthiness" of an object to construct the new Series?


Solution

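  • Don't use ==; map bool over the column instead. Without a custom __eq__, comparing a Test instance to True falls back to an identity check, so it is always False regardless of __bool__.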

    df.where(df['A'].map(bool))
    
                                                  A
    0                                           NaN
    1  <__main__.Test object at 0x000002A70187E6D0>
    

    Or use astype(bool):

    df.where(df.astype(bool))
    
                                                  A
    0                                           NaN
    1  <__main__.Test object at 0x000002A70187E6D0>
    

    However, if you define an __eq__ that compares by truthiness, the original == True comparison works as well:

    class Test():
        def __init__(self, num):
            self.num = num
        def __bool__(self):
            return self.num == 3
        def __eq__(self, other):
            if isinstance(other, type(self)):
                return bool(other) == bool(self)
            else:
                try:
                    return type(other)(self) == other
                except Exception:
                    return False
    
    
    x = Test(2)
    y = Test(3)
    
    df = pd.DataFrame({'A':[x,y]})
    
    print(bool(df['A'].iloc[1]))
    print(df.where(df['A'] == True))
    
    True
                                                  A
    0                                           NaN
    1  <__main__.Test object at 0x000002A701897520>
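
To get the boolean Series itself and the label of the first truthy row, which is what the question ultimately asks for, something along these lines should work. This is a minimal sketch that reuses the Test class from the question; the third row and the names mask and first_truthy are added here for illustration only.

```python
import pandas as pd


class Test:
    """Same class as in the question: truthy only when num == 3."""

    def __init__(self, num):
        self.num = num

    def __bool__(self):
        return self.num == 3


# Third row is an assumption added for illustration.
df = pd.DataFrame({'A': [Test(2), Test(3), Test(3)]})

# Boolean Series built by calling bool() on each object.
mask = df['A'].map(bool)
print(mask.tolist())

# Label of the first truthy row, or None if no row is truthy.
first_truthy = mask[mask].index[0] if mask.any() else None
print(first_truthy)

# Same result via where() and first_valid_index(), as in the question.
print(df['A'].where(mask).first_valid_index())
```

Selecting with mask[mask] keeps only the truthy rows, so the first remaining index label is the first occurrence; the mask.any() guard avoids an IndexError when nothing is truthy.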