python nlp sentiment-analysis keyerror

Receiving KeyError: 0 while calculating polarity in Python


I have two columns, text and title, for news articles.

The data looks fine. Apologies for the screenshot; it is only there to show the structure.

But I get a strange error when I try to calculate the polarity.

from textblob import TextBlob

# Create an empty list to collect the polarity scores
polarity = []

# Create for loop for Text column only
for i in range(len(jordan_df['text'])):
    polarity.append(TextBlob(jordan_df['text'][i]).sentiment.polarity)

# Put data together
polarity_data = {'article_text': jordan_df['text'], 'article_polarity': polarity}

The weird thing is that this code works when I replace jordan_df with some_df, which has the same structure.

Error:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897             try:
-> 2898                 return self._engine.get_loc(casted_key)
2899             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

**KeyError: 0**

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
3 frames
<ipython-input-186-edab50678cab> in <module>()
  9 # Create for loop for Text column only
 10 for i in range(len(jordan_df['text'])):
---> 11     polarity.append(TextBlob(jordan_df['text'][i]).sentiment.polarity)
 12 
 13 # Put data together

/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in __getitem__(self, key)
880 
881         elif key_is_scalar:
--> 882             return self._get_value(key)
883 
884         if is_hashable(key):

/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in _get_value(self, label, takeable)
988 
989         # Similar to Index.get_value, but we do not fall back to positional
--> 990         loc = self.index.get_loc(label)
991         return self.index._get_values_for_loc(self, loc, label)
992 

/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2898                 return self._engine.get_loc(casted_key)
2899             except KeyError as err:
-> 2900                 raise KeyError(key) from err
2901 
2902         if tolerance is not None:

Solution

  • Reset the index of jordan_df before the loop:

    polarity = []

    # Renumber the rows 0..n-1 so that label-based lookup with i works
    jordan_df.reset_index(drop=True, inplace=True)  # add this line

    # Create for loop for Text column only
    for i in range(len(jordan_df['text'])):
        polarity.append(TextBlob(jordan_df['text'][i]).sentiment.polarity)

    # Put data together
    polarity_data = {'article_text': jordan_df['text'], 'article_polarity': polarity}
    

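    You have probably filtered jordan_df at some point, which left gaps in its index. You can see in the head() of jordan_df that the index starts at 7.

    jordan_df['text'][i] looks rows up by index label, not by position, so when i=0 there is no row labelled 0 and pandas raises KeyError: 0. reset_index(drop=True) renumbers the rows from 0, which makes the loop work.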
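
    If you would rather not modify jordan_df, a rough sketch of two alternatives follows. It reproduces the problem on made-up data (news_df and filtered_df are illustrative names, not your actual frame) and uses a hypothetical toy_polarity function as a stand-in for TextBlob(text).sentiment.polarity, so that it runs without TextBlob installed:

```python
import pandas as pd


def toy_polarity(text):
    """Hypothetical stand-in for TextBlob(text).sentiment.polarity."""
    positive = {"good", "great"}
    negative = {"bad", "poor"}
    words = text.lower().split()
    return float(sum(w in positive for w in words) - sum(w in negative for w in words))


# Illustrative data; the real jordan_df is not shown in the question.
news_df = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "text": ["good game", "bad call", "great win", "poor start"],
})

# Filtering keeps the original row labels, so label 0 no longer exists.
filtered_df = news_df[news_df["text"].str.contains("great|poor")]
print(filtered_df.index.tolist())  # [2, 3]

# Label-based lookup with 0 fails, just like in the question.
lookup_error = None
try:
    filtered_df["text"][0]
except KeyError as err:
    lookup_error = err

# Option 1: index by position with .iloc, which ignores the labels.
polarity_iloc = [
    toy_polarity(filtered_df["text"].iloc[i]) for i in range(len(filtered_df))
]

# Option 2: skip manual indexing and let pandas iterate over the column.
polarity_apply = filtered_df["text"].apply(toy_polarity)
```

    With the real data you would pass lambda t: TextBlob(t).sentiment.polarity to apply instead of toy_polarity. The apply version also keeps the original index, so the scores stay aligned with jordan_df['text'] when you put them together.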