Search code examples
mysqlsqlgeospatialdatabase-performance

MySQL geospatial query - return nearest row


I have a table with the following structure:

id | code | coordinates

The ID is the primary key, code is something our internal application uses and the coordinates column is the point data type.

I need to construct a query that will quickly return the nearest row to a set of coordinates.

For example, if I gave the coordinates 50.8, -1.3 I would expect the query to return the nearest row as quick as possible (or at least faster than my attempt below).

Here is my attempt. I would use it but it takes 0.5 seconds to return the result.

SELECT localities.code
  FROM (
    SELECT glength(LineStringFromWKB( 
            LineString( 
            GeomFromText(astext(coordinates)), 
            GeomFromText(astext( PointFromWKB(POINT(663.422238,10.398996)))))))
      AS distance, code  
FROM locality
) localities
ORDER BY localities.distance ASC
LIMIT 1

Can anyone construct a query that will return results faster than this taking advantage of geospatial extensions?


Solution

  • First, you have a bit of redundancy in your expression for selecting the distance.

    I believe you can use

    glength(LineString(coordinates,POINT(663.422238,10.398996))) AS distance
    

    rather than what you have. That might help a bit with performance, but it won't help much.

    The query as constructed performs a full table scan on your locality table to compute each distance. It then orders the results by distance and returns the first one.

    If you don't have any apriori knowledge of the geometry in question it is hard to do better than that. But if you can place a bounding box around your sample point and restrict your search to the bounding box, then you can use a SPATIAL index to speed things up a lot.

    To use a SPATIAL index you need to be using a MyISAM access method on your table (InnoDB won't work). Then create the index on your coordinates column.

    Then you need to work out the suitable bounding box. Let's say you want to exclude any points at a distance more than 50 from your candidate POINT(663.422238,10.398996).

    What you need to do is this:

    First, make a geometric object that's a diagonal line going across your bounding box, like so:

    GeomFromText(
       CONCAT('LINESTRING(',
              663.422238 - 50,' ', 10.398996 - 50, ',',
              663.422238 + 50,' ', 10.398996 + 50, ')'))
    

    Then use that geometric object with the SPATIAL-index exploiting function MBRContains() (minimum bounding rectangle contains) in a query like this.

    SELECT localities.code
      FROM (
         SELECT glength(LineString(coordinates,POINT(663.422238,10.398996))) AS distance,
                code 
           FROM locality
          WHERE  MBRContains(
                    GeomFromText(
                           CONCAT('LINESTRING(',
                                   663.422238 - 50,' ', 10.398996 - 50, ',',
                                   663.422238 + 50,' ', 10.398996 + 50, ')')),
                    coordinates )
           ) AS localities
    ORDER BY localities.distance ASC
    LIMIT 1
    

    The key is the bounding box. This way of running the query saves time by using the spatial index to choose a hopefully very small subset of the rows in the table to consider.

    I wrote this up in some detail for lat/long work here. http://www.plumislandmedia.net/mysql/using-mysqls-geospatial-extension-location-finder/

    Note that it's sometimes a little fiddly to get the syntax of the GeomFromText stuff right. I hope it's right, but I might have misplaced a comma or something.