If I have a dataframe with the following layout:
ID# Response
1234 Covid-19 was a disaster for my business
3456 The way you handled this pandemic was awesome
I want to be able to count the frequency of specific words from a list:
list_of_strings = ['covid', 'COVID', 'Covid-19', 'pandemic', 'coronavirus']
In the end I want to generate a dictionary like the following:
{'covid': 0, 'COVID': 0, 'Covid-19': 1, 'pandemic': 1, 'coronavirus': 0}
Please help, I am really stuck on how to code this in Python.
For each string, find the number of matches:
dict((s, int(df['Response'].str.count(s).fillna(0).sum())) for s in list_of_strings)
Note that Series.str.count takes a regex pattern as input, so you may want to append (?=\b) to each string as a positive look-ahead for the word ending. Series.str.count returns NA when counting NA, so fill those with 0. Then, for each string, sum over the column.
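As a rough end-to-end sketch of the approach above, the snippet below rebuilds the sample dataframe from the question and counts each term. The helper name count_terms, the extra row with a missing response, and the use of re.escape (so that each term is matched literally rather than as a regex) are assumptions added for illustration and are not part of the original answer.

```python
import re

import pandas as pd

# Sample data mirroring the question. The third row (missing response) is an
# added assumption to show why fillna(0) is needed.
df = pd.DataFrame({
    "ID#": [1234, 3456, 5678],
    "Response": [
        "Covid-19 was a disaster for my business",
        "The way you handled this pandemic was awesome",
        None,
    ],
})

list_of_strings = ["covid", "COVID", "Covid-19", "pandemic", "coronavirus"]


def count_terms(responses, terms):
    """Return {term: total case-sensitive matches across the Series}."""
    counts = {}
    for term in terms:
        # re.escape treats the term literally; (?=\b) requires a word
        # boundary right after it.
        pattern = re.escape(term) + r"(?=\b)"
        # str.count yields NA for missing rows, so fill with 0 before summing.
        counts[term] = int(responses.str.count(pattern).fillna(0).sum())
    return counts


word_counts = count_terms(df["Response"], list_of_strings)
print(word_counts)
# {'covid': 0, 'COVID': 0, 'Covid-19': 1, 'pandemic': 1, 'coronavirus': 0}
```

Matching here is case-sensitive, which is why 'covid' and 'COVID' stay at 0 even though 'Covid-19' appears in the first response.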