I'm writing a script that uses a scoring algorithm to test for certain criteria in a set of columns, similar to this example:
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})
df
# First test: True if any column contains its target value
ver1 = (
    df['A'].str.contains('1')
    | df['B'].str.contains('5')
    | df['C'].str.contains('8')
)
df.insert(3, "Result1", ver1)
# Second test: the check against Result1 is commented out here
ver2 = (
    (df['A'].str.contains('4')
     | df['B'].str.contains('7')
     | df['C'].str.contains('9'))  # & (df.loc[df['Result1'] == False])
)
df.insert(4, "Result2", ver2)
df
In some instances I also need to test against the outcome of the previous result, so I tried adding that test case to the end of the second condition:
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})
df
ver1 = (
    df['A'].str.contains('1')
    | df['B'].str.contains('5')
    | df['C'].str.contains('8')
)
df.insert(3, "Result1", ver1)
# Adding the previous result to the condition is what triggers the error below
ver2 = (
    (df['A'].str.contains('4')
     | df['B'].str.contains('7')
     | df['C'].str.contains('9')) & (df.loc[df['Result1'] == False])
)
df.insert(4, "Result2", ver2)
df
I then get this error: TypeError: unsupported operand type(s) for &: 'bool' and 'str'
I see there are methods of converting the boolean to string that could be a workaround but that seems messy. Does anyone know of an easy way to test against mixed 'bool' and 'str' criteria?
df.loc[df['Result1'] == False]
returns a filtered DataFrame (every column of the rows where Result1 is False), not a boolean mask:
   A  B  C  Result1  Result2
2  3  6  9    False     True
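To make the difference concrete, here is a minimal sketch that compares the two expressions. It assumes only the Result1 column from the question has been built; the names `filtered` and `mask` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})
df['Result1'] = (
    df['A'].str.contains('1')
    | df['B'].str.contains('5')
    | df['C'].str.contains('8')
)

# .loc with a boolean mask selects rows, so the result is a DataFrame
# that still holds the string columns A, B and C
filtered = df.loc[df['Result1'] == False]

# The comparison on its own is an element-wise boolean Series
mask = df['Result1'] == False

print(type(filtered).__name__)  # DataFrame
print(type(mask).__name__)      # Series
print(mask.tolist())            # [False, False, True]
```

Combining a boolean Series with `filtered` via `&` means pairing booleans with the string values in A, B and C, which matches the 'bool' and 'str' operands named in the error message.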
Simply use df['Result1'] itself as a boolean Series and negate it with ~. In other words, replace
df.loc[df['Result1'] == False]
with
(~df['Result1'])
Output:
   A  B  C  Result1  Result2
0  1  4  7     True    False
1  2  5  8     True    False
2  3  6  9    False     True
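For reference, a complete version of the question's script with this substitution applied might look like the sketch below. It assumes a recent pandas and that the columns contain no missing values, since `str.contains` would otherwise return NaN entries that `~` cannot negate cleanly.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['4', '5', '6'], 'C': ['7', '8', '9']})

# First test: True if any column contains its target value
ver1 = (
    df['A'].str.contains('1')
    | df['B'].str.contains('5')
    | df['C'].str.contains('8')
)
df.insert(3, 'Result1', ver1)

# Second test: any column matches AND the first test was False.
# ~ negates the boolean Series element-wise.
ver2 = (
    df['A'].str.contains('4')
    | df['B'].str.contains('7')
    | df['C'].str.contains('9')
) & ~df['Result1']
df.insert(4, 'Result2', ver2)

print(df)
```

Because both sides of `&` are boolean Series sharing the same index, the operation is applied row by row and no conversion between bool and str is needed.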