I'm trying to count words in a DataFrame column consisting of speeches. I have created lists of words associated with different themes, for example:
Care = ['safe', 'peace', 'compassion', 'empath', 'care', 'caring', 'protect', 'shield', 'shelter']
Now I would like to count how many times, in total, the words in the "Care" list occur in each speech, and then add a new column at the end of the df with the count for each row.
I'm using this code right now.
df = df.assign(Care=df['speech'].str.count('|'.join(Care)))
But I suspect that it gives me partial matches as well. I would like a match only when a whole word in the speech equals a word in my list. Any ideas?
Assuming that the speeches are free of punctuation marks and use the same letter case as the words in the list, this might work:
df['count'] = df['speech'].apply(lambda x: len([val for val in x.split() if val in Care]))
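If the speeches do contain punctuation or mixed case, another option is to keep the str.count call from the question and wrap the alternation in regex word boundaries (\b). The following is only a sketch: the sample DataFrame is made up for illustration, and ignoring case is an assumption about what is wanted.

```python
import re
import pandas as pd

# Hypothetical sample data; the real df and its 'speech' column come from the question.
df = pd.DataFrame({"speech": [
    "We care about peace, and we protect the unsafe.",
    "Caring for others is careful work.",
]})

Care = ["safe", "peace", "compassion", "empath", "care", "caring",
        "protect", "shield", "shelter"]

# \b marks a word boundary, so "care" no longer matches inside "careful"
# and "safe" no longer matches inside "unsafe". re.escape guards against
# list entries that contain regex metacharacters.
pattern = r"\b(?:" + "|".join(map(re.escape, Care)) + r")\b"

# Assumption: matching should ignore case ("Caring" counts as "caring").
df["Care"] = df["speech"].str.count(pattern, flags=re.IGNORECASE)
```

Note that with whole-word matching a stem such as "empath" will no longer match "empathy" or "empathetic", so such forms would have to be listed explicitly.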