I hope I'm not creating a duplicate lol, but I spent hours looking for something similar to my question :)
That said, I have the following input:
import pandas as pd

foo = {"Brand": ["loc doc poc",
                 "roc top mop",
                 "loc lot not",
                 "roc lot tot",
                 "loc bot sot",
                 "nap rat sat"]}
word_list = ["loc", "top", "lot"]
df = pd.DataFrame(foo)
Two desired outputs:
1 - Dictionary with the occurrences stored
2 - New column containing the number of occurrences for each row
#Outputs:
counter_dic={"loc":3,"top":1,"lot":2}
Brand count
0 loc doc poc 1
1 roc top mop 1
2 loc lot not 2
3 roc lot tot 1
4 loc bot sot 1
5 nap rat sat 0
The only idea that I had:
If you find a similar question, this can obviously be closed.
I checked the following ones:
Check If a String Is In A Pandas DataFrame
Here is one potential solution using str.count to create an interim count DataFrame, which will help with both outputs.
df_counts = pd.concat([df['Brand'].str.count(x).rename(x) for x in word_list], axis=1)
The interim df_counts looks like:
loc top lot
0 1 0 0
1 0 1 0
2 1 0 1
3 0 0 1
4 1 0 0
5 0 0 0
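One caveat worth noting: str.count treats its pattern as a regular expression and counts substring matches, so "lot" would also be counted inside a word like "lots". If only whole-word matches are wanted, a minimal sketch could wrap each term in \b anchors and escape it with re.escape. The names df_counts_exact and demo are made up for illustration, and the sketch assumes the words are separated by whitespace or punctuation.

```python
import re

import pandas as pd

df = pd.DataFrame({"Brand": ["loc doc poc", "roc top mop", "loc lot not",
                             "roc lot tot", "loc bot sot", "nap rat sat"]})
word_list = ["loc", "top", "lot"]

# \b restricts each match to a whole word; re.escape protects against
# regex metacharacters that might appear in a search term.
df_counts_exact = pd.concat(
    [df["Brand"].str.count(rf"\b{re.escape(w)}\b").rename(w) for w in word_list],
    axis=1,
)

# Hypothetical string showing where the two patterns differ.
demo = pd.Series(["lots of lot"])
substring_hits = demo.str.count("lot")        # also matches inside "lots"
whole_word_hits = demo.str.count(r"\blot\b")  # matches the standalone word only
```

On the sample data in the question both variants give the same counts, since every search term only ever appears as a standalone word.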
1 - Dictionary with the occurrences stored
df_counts.sum().to_dict()
[out]
{'loc': 3, 'top': 1, 'lot': 2}
2 - New column containing the number of occurrences for each row
df['count'] = df_counts.sum(axis=1)
[out]
Brand count
0 loc doc poc 1
1 roc top mop 1
2 loc lot not 2
3 roc lot tot 1
4 loc bot sot 1
5 nap rat sat 0
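For convenience, the pieces above can be put together into one script. This is only a sketch that reuses the input from the question; counter_dic is the name the question uses for the desired dictionary.

```python
import pandas as pd

foo = {"Brand": ["loc doc poc", "roc top mop", "loc lot not",
                 "roc lot tot", "loc bot sot", "nap rat sat"]}
word_list = ["loc", "top", "lot"]
df = pd.DataFrame(foo)

# One column per search term, holding the per-row match count.
df_counts = pd.concat(
    [df["Brand"].str.count(x).rename(x) for x in word_list], axis=1
)

# 1 - column sums give the total occurrences of each term.
counter_dic = df_counts.sum().to_dict()

# 2 - row sums give the number of matches in each row.
df["count"] = df_counts.sum(axis=1)

print(counter_dic)
print(df)
```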