
KeyError: 0 when trying to delete rows from a CSV


I have a CSV file that looks like this:

Longitude       Latitude    Value
-123.603607     81.377536   0.348
-124.017502     81.387791   0.386
-124.432344     81.397611   0.383
-124.848099     81.406995   0.405
-125.264724     81.415942   --
...            ...         ...

My code is supposed to remove every row whose longitude/latitude is not within a radius of 0.7 (in lon/lat units) of the point (-111.55, 75.6), using the Pythagorean theorem. The if statement is supposed to flag a row for removal when (-111.55 - Longitude)^2 + (75.6 - Latitude)^2 > 0.7^2.

import pandas as pd
import numpy
import math
df =pd.read_csv(r"C:\\Users\\tx163s\\Documents\\projectfiles\\values.csv")
drop_indices = []
for row in range(len(df)):
   if ((-111.55-df[row]['Longitude'])**2+(75.6-df[row]['Latitude'])**2) > 0.49:
      drop_indices.append(i)
df.drop(drop_indices, axis=0, inplace=True)
df.to_csv(r"C:\\Users\\tx163s\\Documents\\projectfiles\\values.csv")

However, I keep getting KeyError: 0. Is it because I'm appending to a list? How should I fix this?

KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-14-1812652bd9f4> in <module>
      2 
      3 for row in range(len(df)):
----> 4    if ((-71.12167-df[row]['Longitude'])**2+(40.98083-df[row]['Latitude'])**2) > 0.0625:
      5       drop_indices.append(i)
      6 df.drop(drop_indices, axis=0, inplace=True)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
    2925             if self.columns.nlevels > 1:
    2926                 return self._getitem_multilevel(key)
 -> 2927             indexer = self.columns.get_loc(key)
    2928             if is_integer(indexer):
    2929                 indexer = [indexer]

 C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
    2657                 return self._engine.get_loc(key)
    2658             except KeyError:
 -> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
    2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
    2661         if indexer.ndim > 1 or indexer.size > 1:

 pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

 pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

 pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

 pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

 KeyError: 0

Solution

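  • df[row] looks up a column labelled row, not a row. No column is named 0, so pandas raises KeyError: 0. In your code, change df[row]['Longitude'] to df.iloc[row]['Longitude'] (and likewise for 'Latitude'), and change drop_indices.append(i) to drop_indices.append(row), since i is not defined in the code shown: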

    drop_indices = []
    for row in range(len(df)):
       if ((-111.55-df.iloc[row]['Longitude'])**2+(75.6-df.iloc[row]['Latitude'])**2) > 0.49:
          drop_indices.append(row)
    df.drop(drop_indices, axis=0, inplace=True)
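
To see why the original lookup fails, here is a minimal sketch on a small in-memory DataFrame. The two sample rows are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data with the same column names as the question's CSV.
df = pd.DataFrame({
    "Longitude": [-111.50, -124.02],
    "Latitude": [75.60, 81.39],
})

# df[0] is a column lookup; no column is labelled 0, so pandas raises KeyError.
try:
    df[0]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# df.iloc[0] selects the first row by position, which is what the loop needs.
first_row = df.iloc[0]
first_longitude = first_row["Longitude"]
```

Here column_lookup_failed ends up True, while df.iloc[0]["Longitude"] returns the first row's longitude.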
    
    

    However, a better solution is to use a vectorized pandas operation instead of a row-by-row loop:

    df = df[((df[['Longitude','Latitude']] - [-111.55, 75.6])**2).sum(axis=1) <= 0.7**2]
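
As a rough, self-contained illustration of the vectorized filter, the sketch below builds a boolean mask from the squared distance and keeps only the rows inside the radius. The sample rows and the names center_lon, center_lat, radius, dist_sq and inside are assumptions made for this example, not part of the original code.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Longitude": [-111.55, -111.00, -124.02, -111.55],
    "Latitude": [75.60, 75.90, 81.39, 76.50],
    "Value": ["0.348", "0.386", "--", "0.405"],
})

center_lon, center_lat = -111.55, 75.6
radius = 0.7

# Squared distance from every row to the center, computed column-wise.
dist_sq = (df["Longitude"] - center_lon) ** 2 + (df["Latitude"] - center_lat) ** 2

# Boolean mask: keep rows whose squared distance is within radius squared.
inside = df[dist_sq <= radius ** 2]
kept_positions = inside.index.tolist()
```

With this sample data, the first two rows are kept and the last two are dropped. When writing the result back, passing index=False to to_csv avoids saving the row index as an extra column.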