I'm trying to return rows where any of my columns contain any of the words in a word list. Let's say word_list = ['Synthetic', 'Advanced or Advantage/Excellence']. I've tried the following code: df[df.apply(' '.join, 1).str.contains('|'.join(word_list))].
The problem is that some of my columns contain null values, so after running that code I got the error TypeError: sequence item 0: expected str instance, int found (maybe Pandas treats the null values as "int" type?).
Is there any way I can construct my code so that Pandas either ignores the null values or treats them as strings, so that my function works?
The problem is that you are trying to concatenate an int and a str: ' '.join only accepts strings, so any non-string value in a row raises that error. You can try this instead:
df[df.apply(lambda x: x.astype(str).str.contains('|'.join(word_list), case=False).any(), axis=1)]
I've tried this with int, float, and NaN values in the columns and it worked for me.
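For anyone who wants to check this themselves, here is a minimal sketch on a made-up DataFrame. The column names and values are my own assumptions for illustration, not taken from the question; only word_list and the filtering expression come from the post above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question): a mix of str, int and NaN.
df = pd.DataFrame({
    "product": ["Synthetic Oil", "Mineral Oil", "Brake Fluid", "Coolant"],
    "tier": ["Basic", "Advanced or Advantage/Excellence", np.nan, "Basic"],
    "qty": [1, 2, 3, 4],
})

word_list = ['Synthetic', 'Advanced or Advantage/Excellence']

# For each row: cast every cell to str, then check whether any cell
# contains one of the words (case-insensitive, joined into one regex).
mask = df.apply(
    lambda x: x.astype(str).str.contains('|'.join(word_list), case=False).any(),
    axis=1,
)

result = df[mask]
print(result)
```

Note that the joined string is treated as a regular expression, so if your real word list contains characters such as ( or +, you would probably want to escape them first (for example with re.escape).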