pandas, dataframe, data-analysis, sentiment-analysis

Checking the values in a list against Pandas DataFrame


I have a Pandas DataFrame with a list of words and their associated emotions (each can have several emotions attached to it). Something like this:

[Image not shown: a DataFrame with the columns word and emotion, one row per word-emotion pair]

I have also extracted the tokens of a text into a list using spaCy, something like ['study', 'Maths', 'easy', 'great', 'study',...]

In order to match the tokens in tokenList to the associated emotions in the emotion DataFrame (df_lexicon), I have tried the following:

emotions = []

# add the token to the emotions list if it exists in the emotion DataFrame
for i in tokenList:
  if i in df_lexicon['word'].values:
    emotions.append(i)

# print the rows (word and emotion) for each matched token
for i in emotions:
  print(df_lexicon[df_lexicon['word']==i])

But that gives me:

       word   emotion
10215  ban  negative
       word   emotion
10220  mad    negative
       mad    fear
...
(and many more)

I don't know how to add the results to a new DataFrame instead of just printing them. Appreciate your help.


Solution

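  • You can use .isin() to check each value in the word column against the list. It returns a boolean Series that you can use as a mask to select the matching rows:

    # True for every row whose word appears in tokenList
    s = df_lexicon['word'].isin(tokenList)

    # keep only the matching rows, as a new DataFrame
    new_df = df_lexicon[s]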
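
To see the approach end to end, here is a minimal sketch on made-up data. The rows in df_lexicon and the contents of tokenList are hypothetical stand-ins for the real lexicon and spaCy tokens, and the names mask and emotions_per_word are only illustrative. The groupby step at the end is an optional extra, assuming you want all emotions of a word gathered on one row.

```python
import pandas as pd

# Hypothetical sample data standing in for the real lexicon and token list.
df_lexicon = pd.DataFrame({
    "word": ["ban", "mad", "mad", "easy", "great"],
    "emotion": ["negative", "negative", "fear", "positive", "positive"],
})
tokenList = ["study", "Maths", "easy", "great", "study", "mad"]

# Boolean mask: True where the lexicon word occurs in the token list.
mask = df_lexicon["word"].isin(tokenList)

# Select the matching rows; reset_index gives the result a fresh 0..n-1 index.
new_df = df_lexicon[mask].reset_index(drop=True)

# Optional: one entry per word, with all of its emotions collected in a list.
emotions_per_word = new_df.groupby("word")["emotion"].agg(list)

print(new_df)
print(emotions_per_word)
```

Note that the comparison is exact and case-sensitive, so a token like 'Maths' would not match a lexicon entry 'maths' unless both sides are normalised first. Also, .isin() filters the lexicon rows, so a token that appears several times in the list (like 'study' above) does not produce repeated rows.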