Search code examples
mysqlderived-table

how to make mysql derived table use index


MySQL version 5.6.33

I have two tables (files & details) to inner join

table files, has an index called idx_record(record_id)

table details, has an index called idx_end_org_record(end_at, org_id, record_id)

explain
SELECT
    id as file_id,
    record_id,
    recording_path
FROM `files`
INNER JOIN
(
    SELECT `details`.`record_id`
    FROM `details` 
    WHERE `details`.`org_id` IN (6231, 6232, 6233, 6234) 
    AND (`details`.`end_at` BETWEEN '2017-05-31 16:00:00' AND '2017-06-13 16:00:00') 
    ORDER BY end_at desc
) as b
    on `files`.record_id = b.record_id 
WHERE (file_name IS NOT NULL )
LIMIT 30

Output as below

+----+-------------+--------------+-------+--------------------+--------------------+---------+------------------+---------+--------------------------+
| id | select_type | table        | type  | possible_keys      | key                | key_len | ref              | rows    | Extra                    |
+----+-------------+--------------+-------+--------------------+--------------------+---------+------------------+---------+--------------------------+
|  1 | PRIMARY     | <derived2>   | ALL   | NULL               | NULL               | NULL    | NULL             | 3844632 | Using where              |
|  1 | PRIMARY     | files        | ref   | idx_record         | idx_record         | 5       | b.record_id      |       1 | Using where              |
|  2 | DERIVED     | details      | range | idx_end_org_record | idx_end_org_record | 11      | NULL             | 3844632 | Using where; Using index |
+----+-------------+--------------+-------+--------------------+--------------------+---------+------------------+---------+--------------------------+

As you can see, the <derived2> table is not using any index, making this query take almost 1 second to finish.

Any help is appreciated!


Besides, removing WHERE (file_name IS NOT NULL ) in the query doesn't make any difference.

And in mysql 5.7, this problem doesn't even exist, but currently I'm trying to solve this in 5.6.


Solution

  • I don't see the point of the subquery. In particular, the ORDER BY in the subquery is meaningless. If you write the query as a direct join between the two tables, then the optimizer should be able to take advantage of an index on the join column:

    SELECT
    id as file_id,
    record_id,
    recording_path
    FROM files f
    INNER JOIN details d
        ON f.record_id = d.record_id AND
           d.org_id IN (6231, 6232, 6233, 6234) AND
           d.end_at BETWEEN '2017-05-31 16:00:00' AND '2017-06-13 16:00:00'
    WHERE f.file_name IS NOT NULL
    ORDER BY <some column>    -- it doesn't make sense most of the time to use LIMIT
    LIMIT 30                  -- without ORDER BY
    

    Adding an index on record_id, org_id, and end_at in the details table, and on file_name in the files table, should give you some peformance improvements.