Search code examples
mysqldatabaseinner-jointemp-tables

MySQL lookup using JOIN or correlated subquery with TEMPORARY TABLE


I have two primary tables, providers’ anddepots`. Each provider has a compulsory first location. Additional locations are provided by adding an additional 'depot'. 'etc' represents the large number of additional vars related to the provider account. Extra table info, and filters, are omitted here for the sake of simplification.

providers
+--------+------------+-----+
| id     | location   | etc |
+--------+------------+-----+
| 1      | POINT(1,1) | ... |
| 2      | POINT(1,2) | ... |
| 3      | POINT(1,3) | ... |
+--------+------------+-----+

depots
+---------+------------+------------+
| depotId | providerId | location   |
+---------+------------+------------+
| 1       | 1          | POINT(2,1) |
| 2       | 1          | POINT(2,2) |
| 3       | 1          | POINT(2,3) |
| 4       | 2          | POINT(2,4) |
| 5       | 2          | POINT(2,5) |
+---------+------------+------------+

A provider may have zero, or multiple, additional depots. These 'locations' are used to calculate the distances of an incoming ‘job’ from each provider. Traditionally I use a UNION to join the providers and depots table to form a single table, which I will refer to as provDeps.

SELECT id, location, 0 AS depotId FROM providers UNION SELECT p.id, d.location d.id AS depotId FROM providers p, depots d

Let's assume this is a view, forgoing efficiency and indexing for now. It hopefully will reduce the visual complexity of the query.

provDeps
+--------+------------+---------+-----+
| id     | location   | depotId | etc |
+--------+------------+---------+-----+
| 1      | POINT(1,1) | 0       | ... |
| 1      | POINT(2,1) | 1       | ... |
| 1      | POINT(2,2) | 2       | ... |
| 1      | POINT(2,3) | 3       | ... |
| 2      | POINT(1,2) | 0       | ... |
| 2      | POINT(2,4) | 4       | ... |
| 2      | POINT(2,5) | 5       | ... |
| 3      | POINT(1,3) | 0       | ... |
+--------+------------+---------+-----+

I then perform an additional lookup using provDeps. The idea here is to calculate the distance of the job to each and every depot. This is performed via a stored procedure.

SELECT loc.*, degToMeter(st_distance(jobLocation, location)) AS distanceToJob FROM provDeps;

+--------+------------+---------+---------------+-----+
| id     | location   | depotId | distanceToJob | etc |
+--------+------------+---------+---------------+-----+
| 1      | POINT(1,1) | 0       | 8234          | ... |
| 1      | POINT(2,1) | 1       | 7334          | ... |
| 1      | POINT(2,2) | 2       | 6434          | ... |
| 1      | POINT(2,3) | 3       | 5534          | ... |
| 2      | POINT(1,2) | 0       | 4634          | ... |
| 2      | POINT(2,4) | 4       | 3734          | ... |
| 2      | POINT(2,5) | 5       | 2834          | ... |
| 3      | POINT(1,3) | 0       | 1934          | ... |
+--------+------------+---------+---------------+-----+

I now need to reduce this list, to retain only the closest depot, grouped by provider id. The results would include every provider, but with only the one depot for each - either the primary location, depot '0', or the ID of the closest depot. This is the desired result :-

+--------+------------+---------+---------------+-----+
| id     | location   | depotId | distanceToJob | etc |
+--------+------------+---------+---------------+-----+
| 1      | POINT(2,3) | 3       | 5534          | ... |
| 2      | POINT(2,5) | 5       | 2834          | ... |
| 3      | POINT(1,3) | 0       | 1934          | ... |
+--------+------------+---------+---------------+-----+

I have attempted this a number of ways, but each way I'm experiencing a different issue. The closest I've gotten to success is by using a temporary table :-

DROP TEMPORARY TABLE IF EXISTS locTemp;
CREATE TEMPORARY TABLE locTemp AS
    SELECT depots.*, st_distance(jobLocation, location) AS distanceToJob
    FROM provDeps

I then tried using a correlated subquery, but this yields errors about trying to access a temporary table twice in a single lookup :-

SELECT * FROM locTemp
WHERE distanceToJob = (SELECT MIN(distanceToJob) FROM locTemp AS lt WHERE lt.id = locTemp.id);

This results in the error "Can't reopen table: 'locTemp'". I have also tried performing a join, but then I get grouping errors, or an inability to access the temporary table itself, from within a subquery :-

SELECT * FROM
(
    SELECT id, depotId, MIN(distanceToJob) as minDist
    FROM locTemp GROUP BY id
) AS res
INNER JOIN locTemp AS lt on lt.id = res.id and lt.minDist = res.distanceToJob;

Any pointers, or better still solutions, will be greatly appreciated! :)


Solution

  • I don't think you necessarily need a temporary table or view here. Your final query looks completely on track here, which I have modified below. The only issues I see are that you select a non aggregate column with GROUP BY and perhaps there is an issue with using the temporary table.

    SELECT
        t1.id, t1.location, t1.depotId,
        degToMeter(st_distance(t1.jobLocation, t1.location)) AS distanceToJob
    FROM provDeps t1
    INNER JOIN
    (
        SELECT
            id,
            MIN(degToMeter(st_distance(t1.jobLocation, t1.location))) AS minDistanceToJob
        FROM provDeps
        GROUP BY id
    ) t2
        ON t1.id = t2.id AND
           degToMeter(st_distance(t1.jobLocation, t1.location)) = t2.minDistanceToJob;