Tags: pandas, join, split, counter, strsplit

Getting frequency of words from a pandas dataframe column


I have a dataframe with a column, cast, that lists multiple actors for each movie. How do I count the number of times each actor appears in the dataset? This is a snippet of what the column looks like:

df['cast'][:3]
0    João Miguel, Bianca Comparato, Michel Gomes, R...
1    Demián Bichir, Héctor Bonilla, Oscar Serrano, ...
2    Tedd Chan, Stella Chung, Henley Hii, Lawrence ...
Name: cast, dtype: object

Can anyone help?


Solution

  • For example, use the following code snippet to count how many times 'Stella Chung' appears:

    " ".join(df['cast'].values).count('Stella Chung')

    UPDATE:

    Here's an explanation of what is being done:

    • df['cast'].values returns an array containing all the individual values from the column named cast.
    • " ".join(array) joins all the strings in the array into a single large string.
    • string.count(substring) returns the number of non-overlapping times the substring occurs in the main string. Because this is a plain substring match, a name that is part of a longer name is counted as well.
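
The snippet above counts one actor at a time. To count every actor in one pass, as the question asks, one possible approach is to split each row on commas and tally the pieces. The following is a minimal sketch rather than a definitive solution: the sample dataframe is made up for illustration, and it assumes the names are separated by commas and that rows with a missing cast should be skipped.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataset.
df = pd.DataFrame({
    "cast": [
        "João Miguel, Bianca Comparato, Michel Gomes",
        "Tedd Chan, Stella Chung, Henley Hii",
        "Stella Chung, João Miguel",
        None,  # a movie with no cast listed
    ]
})

actor_counts = (
    df["cast"]
    .dropna()          # skip rows with a missing cast
    .str.split(",")    # each row becomes a list of names
    .explode()         # one name per row
    .str.strip()       # drop the space left after each comma
    .value_counts()    # tally how often each name appears
)

# The join/count approach from the answer, for a single actor.
# dropna() is added here because str.join fails on missing values.
single_count = " ".join(df["cast"].dropna()).count("Stella Chung")

print(actor_counts)
print(single_count)
```

actor_counts is a Series indexed by actor name, so actor_counts['Stella Chung'] looks up one actor and actor_counts.head(10) lists the ten most frequent. Unlike the substring count, this matches whole names only.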