Search code examples
algorithmperformancearchitecturegeolocationgeospatial

Finding POIs that are near or contain a certain location


I have an application that does the following:

  • Receives a device's location
  • Fetches a route (collection of POIs, or Points of Interest) assigned to that device
  • Determines if the device is near any of the POIs in the route

The route's POIs can be either a point with a radius, in which case it should detect if the device is within the radius of the point; or a polygon, where it should detect if the device is inside of it.

Here is a sample of a route with 3 POIs, two of them are points with different radii, and the other one is a polygon:

https://jsonblob.com/285c86cd-61d5-11e7-ae4c-fd99f61d20b8

My current algorithm is programmed in PHP with a MySQL database. When a device sends a new location, the script loads all the POIs for its route from the database into memory, and then iterates through them. For POIs that are points, it uses the Haversine formula to find if the device is within the POI's radius, and for POIs that are polygons it uses a "point in polygon" algorithm to find if the device is inside of it or not.

I would like to rewrite the algorithm with the goal of using less computing resources than the current one. We receive about 100 locations per second and they each have to be checked against routes that have about 40 POIs on average.

I can use any language and database to do so, which ones would you recommend for the best possible performance?


Solution

  • I'd use a database (e.g., Postgresql) that supports spatial queries.

    That will let you create a spatial index that puts a bounding box around each POI. You can use this to do an initial check to (typically) eliminate the vast majority of POIs that aren't even close to the current position (i.e., where the current position isn't inside their bounding box).

    Then when you've narrowed it down to a few POIs, you can test the few that are lest using roughly the algorithm you are now--but instead of testing 40 POIs per point, you might be testing only 2 or 3.

    Exactly how well this will work will depend heavily upon how close to rectangular your POIs are. Circular is close enough that it tends to give pretty good results.

    Others may depend--for example, a river that runs nearly north and south may work quite well. If you have a river that runs mostly diagonally, it may be worthwhile to break it up into a number of square/rectangular segments instead of treating the whole thing as a single feature, since the latter will create a bounding box with a lot of space that's quite a ways away from the river.