I have an array of lat long coordinates and I am trying to use a KDTree and scipy's query_ball_point
to return all data points within a 1 mile radius of a designated latitude and longitude.
The problem is that query_ball_point
is returning points that are outside of the specified 1 mile radius. Here's my code:
import pandas as pd
import scipy as sp
import geocoder
import pysal as psl
search_list = df['coordinates'].tolist()
tree = psl.cg.KDTree(search_list, distance_metric='Arc', radius=psl.cg.RADIUS_EARTH_MILES)
latlong = (39.698840000000004, -104.975916)
index = tree.query_ball_point(latlong,r=1)
The result is an array of coordinates like the following:
+---------------------------------------+
| coordinates |
+---------------------------------------+
| (39.676973877551, -104.966231826172) |
| (39.6777407534644, -104.988982458831) |
| ... |
+---------------------------------------+
When I try to use the haversine formula to validate these results, I see the first coordinate is 1.6 miles way
from haversine import haversine
haversine((39.676973877551, -104.966231826172),
(39.698840000000004, -104.975916),miles=True)
1.5961362762187963
Pysal doesn't use the haversine function to calculate distance for the query_ball_point method. It uses the pysal.cg.sphere.arcdist function, which is different.
import pysal
from pysal.cg.kdtree import KDTree
locations = [(40.702566, -73.816859),
(40.70546, -73.810708),
(40.709179, -73.820574),
(40.700486, -73.807969),
(40.694624, -73.820593),
(40.695132, -73.820841),
(40.694095, -73.821334),
(40.694165, -73.822368),
(40.695077, -73.822817),
(40.6747769261, -73.8092618174)]
tree = KDTree(locations, distance_metric='Arc', radius=pysal.cg.RADIUS_EARTH_MILES)
current_point = (40.709523, -73.802472)
# get all points within X miles of 'current_point'
indices = tree.query_ball_point(current_point, 1)
for i in indices:
print(locations[i])
There are 3 points within 1 mile
(40.70546, -73.810708)
(40.700486, -73.807969)
(40.6747769261, -73.8092618174)
Not all these points are within 1 mile according to haversine formula:
from haversine import haversine
for i in indices:
print(haversine(current_points, locations[i], miles = True))
0.5146716729994124
0.6875825817591269
2.4269297885659022
But they are within 1 mile according to pysal's arcdist formula using radius of 3958.756 miles:
from pysal.cg.sphere import arcdist
for i in indices:
print(arcdist(current_points, locations[i], 3958.756))
0.5744128196875283
0.4178272122350164
0.8175408580090955