Tags: python, pandas, dataframe, boolean, logical-operators

Check cells in pandas columns for boolean + strings and return boolean (TypeError: unsupported operand type(s) for &: 'bool' and 'str')


I'm writing a script that uses a scoring algorithm to test for certain criteria in a set of columns, similar to this example:

import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})
df
# First check: True where any column matches its pattern
ver1 = (
    df['A'].str.contains('1') |
    df['B'].str.contains('5') |
    df['C'].str.contains('8')
)
df.insert(3, "Result1", ver1)

# Second check: same idea, with the extra condition commented out for now
ver2 = (
    (df['A'].str.contains('4') |
    df['B'].str.contains('7') |
    df['C'].str.contains('9'))  # & (df.loc[df['Result1'] == False])
)
df.insert(4, "Result2", ver2)
df

In some instances I also need to test against the outcome of the previous result, so I tried adding that test onto the end of the second condition:


df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})
df
ver1 = (
    df['A'].str.contains('1') |
    df['B'].str.contains('5') |
    df['C'].str.contains('8')
)
df.insert(3, "Result1", ver1)

# The & below combines a boolean Series with the rows returned by df.loc[...]
ver2 = (
    (df['A'].str.contains('4') |
    df['B'].str.contains('7') |
    df['C'].str.contains('9')) & (df.loc[df['Result1'] == False])
)
df.insert(4, "Result2", ver2)
df

This raises: TypeError: unsupported operand type(s) for &: 'bool' and 'str'

I see there are ways to convert the booleans to strings as a workaround, but that seems messy. Does anyone know of an easy way to test against mixed 'bool' and 'str' criteria?


Solution

  • df.loc[df['Result1'] == False] returns a DataFrame of the matching rows, not a boolean mask (shown here after Result2 has been added):

       A  B  C  Result1  Result2
    2  3  6  9    False     True
    

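    Combining those rows, which include the string columns A, B and C, with a boolean Series via & is the likely source of the 'bool' and 'str' TypeError. Use df['Result1'] itself as a boolean Series instead. In other words, replace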

    df.loc[df['Result1'] == False]
    

    with

    (~df['Result1'])
    

    Output:

       A  B  C  Result1  Result2
    0  1  4  7     True    False
    1  2  5  8     True    False
    2  3  6  9    False     True
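
For reference, the fix above can be put together as one runnable script. This is a minimal sketch using the question's own data; the names mask and rows are introduced here only for illustration and do not appear in the original code.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})

# First check: True where any column matches its pattern.
ver1 = (
    df['A'].str.contains('1')
    | df['B'].str.contains('5')
    | df['C'].str.contains('8')
)
df.insert(3, 'Result1', ver1)

# Illustration only (hypothetical names): a boolean mask versus filtered rows.
mask = ~df['Result1']   # boolean Series, one value per row
rows = df.loc[mask]     # DataFrame holding only the rows where mask is True

# Second check: a pattern matches AND the first check was False.
ver2 = (
    df['A'].str.contains('4')
    | df['B'].str.contains('7')
    | df['C'].str.contains('9')
) & mask
df.insert(4, 'Result2', ver2)

print(type(mask).__name__, type(rows).__name__)
print(df)
```

The ~ operator negates the boolean Series element-wise, so the result has the same index as the str.contains masks and & can combine them row by row.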