Search code examples
mysqlsqlwhere-clausehavinghaving-clause

MySQL query to calculate distance between two coordinates but too slow with HAVING clause


I have the following query that calculates the distance between two coordinates. However, I want to select the users that are within the range of n km. Say, for example, I want to select all users that are within a distance of 100km. Therefore, I have to use HAVING clause here. However, I see that the execution is damn too slow. Even with just two records it's returning the results very slowly. I wonder what would happen with a million user records (in future). Therefore, I am seeking an optimization to my current query for efficient and faster running.

SELECT *, 
  111.1111 *
  DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.latitude)) 
  * COS(RADIANS(b.latitude)) 
  * COS(RADIANS(a.longitude) - RADIANS(b.longitude)) 
  + SIN(RADIANS(a.latitude))
  * SIN(RADIANS(b.latitude))))) AS distance_in_km 
FROM users AS a
JOIN users AS b ON a.id <> b.id
WHERE b.id != :user AND a.id = :user 
HAVING distance_in_km < :maxdist 
LIMIT 30

UPDATE

As suggested by Rick James, I have removed GROUP BY clause and replaced it with AND a.id = :user in WHERE clause. This, as of now, is returning the same result as was with GROUP BY.


Solution

  • The basic answer is that you cannot make your query more efficient. For your approach, you basically need to calculate the distance between all pairs of users, and that is expensive.

    There may be some hacks that you can use. First, you probably don't need reversed pairs, so you can replace a.id <> b.id with a.id < b.id. That will eliminate half the work.

    You can use a where clause to pre-filter the rows. For instance, on most of the earth's surface, points that are more than 2 degrees of latitude or longitude apart are more than 100 km apart. This is not true everywhere. But it is probably good enough for you. That allows you to write:

    where a.latitude between b.latitude - 2 and b.latitude + 2 and
          a.longitude between b.longitude - 2 and b.longitude + 2
    

    That will save much of the trigonometry, if your users are spread far and wide.

    However, the real solution is to use the GIS extensions of MySQL. A place to start learning about that is in the documentation.

    EDIT:

    SELECT *,
       111.1111 *
       DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.latitude))
             * COS(RADIANS(b.latitude))
             * COS(RADIANS(a.longitude) - RADIANS(b.longitude))
             + SIN(RADIANS(a.latitude))
             * SIN(RADIANS(b.latitude))))) AS distance_in_km
    FROM users a JOIN
         users b
         ON a.id <> b.id
    WHERE a.id = :user AND
          a.latitude between b.latitude - 2 and b.latitude + 2 and
          a.longitude between b.longitude - 2 and b.longitude + 2
    HAVING distance_in_km < 100