Have found an inefficient query in our system. content
holds versions of slides, and this is supposed to select the highest version of a slide by id.
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
group by `slide_id`
) c ON `c`.`version` = `content`.`version`;
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | ref | PRIMARY,version | PRIMARY | 8 | const | 9703 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
One big issue is that it returns almost all the slides in the system as the outer query does not filter by slide id. After adding that I get...
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`version` = `content`.`version`
WHERE `slide_id` = '16901';
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | const | PRIMARY,fk_content_slides_idx,version,slide_id | PRIMARY | 16 | const,const | 1 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
That reduces the amount of rows down to one correctly, but doesnt really speed things up.
There are indexes on version, slide_id and a unique key on version AND slide_id.
Is there anything else I can do to speed this up?
Use a TOP LIMIT 1 insetead of Max ?
m
MySQL seems to take an index (version, slide_id) to join the tables. You should get a better result with
SELECT `content`.*
FROM `content`
FORCE INDEX FOR JOIN (fk_content_slides_idx)
join (
SELECT `slide_id`, max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`slide_id` = `content`.`slide_id` and `c`.`version` = `content`.`version`
You need an index that has slide_id as first column, I just guessed that's fk_content_slides_idx
, if not, take another one.
The part FORCE INDEX FOR JOIN (fk_content_slides_idx)
is just to enforce it, you should try if mysql takes it by itself without forcing (it should).
You might get even a slightly better result with an index (slide_id, version), it depends on the amount of data (e.g. the number of versions per id) if you see a difference (but you should not spam indexes, and you already have a lot on this table, but you can try it for fun.)