I have a CSV file that looks like this:
Longitude Latitude Value
-123.603607 81.377536 0.348
-124.017502 81.387791 0.386
-124.432344 81.397611 0.383
-124.848099 81.406995 0.405
-125.264724 81.415942 --
... ... ...
I have code that is supposed to remove any rows whose longitude/latitude is not within a radius of 0.7 (in lon/lat units) of the point (-111.55, 75.6), using the Pythagorean theorem. The if statement is supposed to remove any row where (-111.55-Longitude)^2+(75.6-Latitude)^2 > (0.7)^2.
import pandas as pd
import numpy
import math
df =pd.read_csv(r"C:\\Users\\tx163s\\Documents\\projectfiles\\values.csv")
drop_indices = []
for row in range(len(df)):
    if ((-111.55-df[row]['Longitude'])**2+(75.6-df[row]['Latitude'])**2) > 0.49:
        drop_indices.append(i)
df.drop(drop_indices, axis=0, inplace=True)
df.to_csv(r"C:\\Users\\tx163s\\Documents\\projectfiles\\values.csv")
However, I keep getting KeyError: 0. Is it because I'm trying to append to a list? How should I fix this?
KeyError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-14-1812652bd9f4> in <module>
2
3 for row in range(len(df)):
----> 4 if ((-71.12167-df[row]['Longitude'])**2+(40.98083-df[row]['Latitude'])**2) > 0.0625:
5 drop_indices.append(i)
6 df.drop(drop_indices, axis=0, inplace=True)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2925 if self.columns.nlevels > 1:
2926 return self._getitem_multilevel(key)
-> 2927 indexer = self.columns.get_loc(key)
2928 if is_integer(indexer):
2929 indexer = [indexer]
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
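In your code, change df[row][...] to df.iloc[row][...] (for both 'Longitude' and 'Latitude'), and change drop_indices.append(i) to drop_indices.append(row). df[row] looks up a column labelled row (0 on the first iteration), which does not exist, hence KeyError: 0, whereas df.iloc[row] selects the row by position. The name i is never defined, so the append has to use row instead: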
drop_indices = []
for row in range(len(df)):
    if ((-111.55-df.iloc[row]['Longitude'])**2+(75.6-df.iloc[row]['Latitude'])**2) > 0.49:
        drop_indices.append(row)
df.drop(drop_indices, axis=0, inplace=True)
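To see the difference between the two lookups in isolation, here is a minimal sketch on a tiny made-up frame (the two rows are illustrative values, not taken from your file). It only assumes the column names from the question.

```python
import pandas as pd

# Small made-up frame with the same column names as in the question.
df = pd.DataFrame({
    "Longitude": [-111.60, -123.603607],
    "Latitude": [75.65, 81.377536],
})

# df[0] is a column lookup; there is no column labelled 0, so it raises KeyError.
try:
    df[0]
    raised = False
except KeyError:
    raised = True

# df.iloc[0] is a positional row lookup and returns the first row as a Series.
first_row = df.iloc[0]
print(raised, first_row["Longitude"])
```

Note that df.drop removes rows by index label, so appending the position row only works as intended when the frame has the default 0..n-1 index, which is what pd.read_csv gives you unless you ask for something else.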
However, a better solution is to use a vectorized pandas operation, which avoids the explicit loop:
df = df[((df[['Longitude','Latitude']] - [-111.55, 75.6])**2).sum(axis=1) <= 0.7**2]
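The same filter can be spelled out with named intermediate steps, which may be easier to read and adapt. This is only a sketch: the sample rows are made up, and the names center_lon, center_lat, radius, dist_sq and inside are illustrative, not anything from your code.

```python
import pandas as pd

# Made-up sample rows; in practice df would come from pd.read_csv.
df = pd.DataFrame({
    "Longitude": [-111.60, -111.00, -123.603607],
    "Latitude": [75.65, 76.00, 81.377536],
    "Value": [0.348, 0.386, 0.383],
})

center_lon, center_lat, radius = -111.55, 75.6, 0.7

# Squared distance of every row from the centre, computed column-wise.
dist_sq = (df["Longitude"] - center_lon) ** 2 + (df["Latitude"] - center_lat) ** 2

# Keep rows on or inside the circle; this mirrors dropping rows where dist_sq > 0.49.
inside = df[dist_sq <= radius ** 2]
print(inside)
```

If you write the result back with to_csv, passing index=False keeps pandas from adding the row index as an extra column in the file.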