KeyError from new dataframe when plotting

KeyError: 0 after creating a new dataframe and then attempting to plot it.

Initially, the code plotted the original dataframe without problems. I then removed a small number of rows (about 5) and created a new dataframe. The new dataframe displays without issue, but attempting to plot it raises KeyError: 0. I have tried to resolve the issue without success.

The following script replaces zeros with NaN, drops the rows with missing data, and creates the new dataframe:

df_pre_orderset2_t = df_pre_orderset2.replace(0, np.nan)
df_pre_orderset2_top = df_pre_orderset2_t.dropna()
pd.set_option('display.max_colwidth', None)

df_pre_orderset2_to_10 = df_pre_orderset2_top.head(10)
df_pre_orderset2_top10 = pd.DataFrame(df_pre_orderset2_to_10)
df_pre_orderset2_top10

The plot script is as follows:

plt.figure(figsize=(9,7))
ax = plt.gca()
x = df_pre_orderset2_top10['warning_status']
y = df_pre_orderset2_top10['count']
n = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
t = np.arange(10)

plt.title('Warning distribution versus order sets')
plt.ylabel('Warning count by order sets')
plt.xlabel('Warning alerts')
plt.scatter(x, y, c=t, s=100, alpha=1.0, marker='^')
plt.gcf().set_size_inches(13,8)

#scatter labels
for i, txt in enumerate(n):
   ax.annotate(txt, (x[i],y[i]))

plt.show()

This returns an incomplete outline of the proposed plot and a KeyError: 0, as shown below.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-161-0cc4009cf4a7> in <module>
     16 #scatter labels
     17 for i, txt in enumerate(n):
---> 18     ax.annotate(txt, (x[i],y[i]))
     19 
     20 plt.show()

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in __getitem__(self, key)
    851 
    852         elif key_is_scalar:
--> 853             return self._get_value(key)
    854 
    855         if is_hashable(key):

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in _get_value(self, label, takeable)
    959 
    960         # Similar to Index.get_value, but we do not fall back to positional
--> 961         loc = self.index.get_loc(label)
    962         return self.index._get_values_for_loc(self, loc, label)
    963 

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
-> 3082                 raise KeyError(key) from err
   3083 
   3084         if tolerance is not None:

KeyError: 0

Solution

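The line that's failing is ax.annotate(txt, (x[i],y[i])), and it fails when i=0. Both x and y are Series objects, columns taken from df_pre_orderset2_top10, and indexing a Series that has an integer index with x[i] looks up the index label i, not the i-th position. So I'm guessing that the row with 0 as its index was removed when you dropped rows from that dataframe (replace(0, np.nan) followed by dropna() drops every row that contained a 0). You should be able to verify this by displaying the dataframe and checking its index.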
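Here is a minimal sketch of what I think is happening. The column names come from the question, but the values are made up for illustration, so treat it as a reproduction under assumptions rather than your actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_pre_orderset2; the values are invented.
df = pd.DataFrame({
    "warning_status": [0, 3, 5, 7],
    "count": [12, 0, 8, 4],
})

# Same cleaning steps as in the question: zeros become NaN,
# then every row containing a NaN is dropped.
cleaned = df.replace(0, np.nan).dropna()

# Rows labelled 0 and 1 each contained a zero, so they are gone.
# The surviving rows keep their original labels.
remaining_labels = list(cleaned.index)
print(remaining_labels)

x = cleaned["warning_status"]
try:
    x[0]  # looks up the label 0, which no longer exists
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```

The surviving rows keep the labels they had before dropna(), so label 0 is simply missing and x[0] has nothing to find.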

If this is the case, you can reset the index of that dataframe before you extract the x and y columns. Set drop=True to make sure the old index isn't added to the dataframe as a new column.

df_pre_orderset2_top10.reset_index(drop=True, inplace=True)

That should fix the problem.
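If you would rather keep the original index, an alternative (my suggestion, not something your code already does) is to avoid label lookups in the annotation loop altogether, either with x.iloc[i] or by zipping the values. A rough sketch with invented data and a deliberately non-contiguous index:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data with gaps in the index, like the result of dropna().
df = pd.DataFrame(
    {"warning_status": [3, 5, 7], "count": [12, 8, 4]},
    index=[2, 3, 5],
)
labels = ["a", "b", "c"]  # one label per remaining row

fig, ax = plt.subplots(figsize=(9, 7))
ax.scatter(df["warning_status"], df["count"], marker="^", s=100)

# zip() walks the values in row order, so the index labels are never consulted.
for txt, xi, yi in zip(labels, df["warning_status"], df["count"]):
    ax.annotate(txt, (xi, yi))

annotation_count = len(ax.texts)
print(annotation_count)
```

Either approach works; resetting the index is the smaller change to your existing script, while zip or .iloc keeps the original row labels available if you need them later.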