How to extract and sum unique words from a pandas DataFrame


Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({'animals': [['dog','cat','snake','lion','tiger'],
                               ['dog','moose','alligator','lion','tiger'],
                               ['eagle','moose','alligator','lion','tiger'],
                               ['cat','alligator','lion']]})

I need to extract every unique animal and count how many times each one occurs across all rows. The output should look something like:

dog             2  
cat             2  
snake           1  
lion            4  
tiger           3  
moose           2  
alligator       3  
eagle           1 

Similar to what df.value_counts() does.

Much appreciated.


Solution

  • You can use explode to flatten the lists into one row per animal, then value_counts to tally them:

    df.animals.explode().value_counts()
    

    Output:

    lion         4
    tiger        3
    alligator    3
    moose        2
    cat          2
    dog          2
    eagle        1
    snake        1
    Name: animals, dtype: int64
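
For reference, here is a self-contained sketch of the approach above that can be run as-is. It also shows one possible alternative, not part of the original answer, that uses collections.Counter to keep the animals in order of first appearance, as in the question's expected output. The variable names counts and first_seen are illustrative only.

```python
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame({'animals': [['dog', 'cat', 'snake', 'lion', 'tiger'],
                               ['dog', 'moose', 'alligator', 'lion', 'tiger'],
                               ['eagle', 'moose', 'alligator', 'lion', 'tiger'],
                               ['cat', 'alligator', 'lion']]})

# Approach from the answer: one row per list element, then count each value.
# value_counts sorts by count (descending); the order among ties may vary.
counts = df['animals'].explode().value_counts()

# Assumed alternative: chain the lists together and count with Counter.
# A Counter remembers insertion order, so the resulting Series lists
# animals in the order they first appear in the column.
first_seen = pd.Series(dict(Counter(chain.from_iterable(df['animals']))))

print(counts)
print(first_seen)
```

Both results hold the same counts; they differ only in row order. The display of the Name and dtype footer may also differ between pandas versions.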