I have a table with 20M rows and a query that takes 10 seconds:
select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
The execution plan lists a suitable index, but it isn't actually used:
explain extended select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
           id: 1
  select_type: SIMPLE
        table: entity
         type: ref
possible_keys: constraint_unique_class_legacy_id,entity_tag,entity_class_modification_date_int_idx,entity_name_idx
          key: entity_class_modification_date_int_idx
      key_len: 8
          ref: const
         rows: 288440
     filtered: 100.00
        Extra: Using where; Using filesort
If I run 'flush status' and then the query, the handler counters show a full scan:
Handler_read_next: 20318800
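For reference, the full measurement sequence looks like this ('flush status' resets the session counters, 'show session status' reads them back; both are standard MySQL statements):

flush status;
select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc;
show session status like 'Handler_read%';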
But if I add a hint to use the index reported by 'explain extended', there is no full scan and the query finishes in 250 ms:
select id from entity
use index (entity_class_modification_date_int_idx)
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
Only ~166K rows were scanned:
Handler_read_next: 165894
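As an aside, 'force index' is the stronger variant of this hint: unlike 'use index', it additionally tells the optimizer to treat a full table scan as very expensive. A sketch with the same index:

select id from entity
force index (entity_class_modification_date_int_idx)
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc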
Why do I have to hint the optimizer to use an index that is already in the execution plan?
If I add '+ 0' to the 'order by', the query finishes in 250 ms as well (the sort expression is no longer a bare column, so the optimizer cannot use the primary key to deliver rows in sorted order):
select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id + 0 desc
'explain extended' shows the same execution plan in every case, and 'analyze table' doesn't help.
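For completeness, the statistics refresh referred to above is the standard statement:

analyze table entity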
Table 'entity':
CREATE TABLE `entity` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(4096) COLLATE utf8_bin DEFAULT NULL,
`tag` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`revision` int(11) NOT NULL,
`class_id` bigint(20) NOT NULL,
`legacy_id` bigint(20) DEFAULT NULL,
`description` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`last_modified_by` varchar(64) COLLATE utf8_bin DEFAULT NULL,
`removed` tinyint(1) NOT NULL DEFAULT '0',
`modification_date_int` bigint(20) DEFAULT NULL,
`creation_date_int` bigint(20) DEFAULT NULL,
`created_by` varchar(64) COLLATE utf8_bin DEFAULT NULL,
`ancestor_class_id` bigint(20) NOT NULL,
`acu_id` bigint(20) DEFAULT NULL,
`secured` tinyint(1) DEFAULT '1',
`system_modification_date` bigint(20) DEFAULT NULL,
`archived` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `constraint_unique_class_legacy_id` (`class_id`,`legacy_id`),
UNIQUE KEY `entity_tag` (`class_id`,`tag`),
UNIQUE KEY `class_hierarchy_tag` (`tag`,`ancestor_class_id`),
KEY `entity_legacy_id_idx` (`legacy_id`),
KEY `entity_modification_date_int_idx` (`modification_date_int`),
KEY `entity_class_modification_date_int_idx` (`class_id`,`removed`,`modification_date_int`),
KEY `ancestor_class_id` (`ancestor_class_id`),
KEY `acu_id` (`acu_id`),
KEY `entity_name_idx` (`class_id`,`name`(255)),
KEY `entity_archived_idx` (`archived`),
CONSTRAINT `entity_ibfk_1` FOREIGN KEY (`class_id`) REFERENCES `class` (`id`),
CONSTRAINT `entity_ibfk_2` FOREIGN KEY (`ancestor_class_id`) REFERENCES `class` (`id`),
CONSTRAINT `entity_ibfk_3` FOREIGN KEY (`acu_id`) REFERENCES `acu` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=60382455 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
MySQL version:
SELECT @@version;
+--------------------+
| @@version |
+--------------------+
| 5.6.30-76.3-56-log |
+--------------------+
UPDATE: I partially found an answer. MySQL Workbench, which I'm using, implicitly appends 'limit 1000' to queries, and that drastically reduces performance even though the query returns far fewer rows than that anyway. With the limit in place, 'explain extended' shows PRIMARY as the key, so the plan discrepancy is no longer a mystery. If I increase the limit to 10000, the query finishes in 250 ms. It looks like the MySQL optimizer has a heuristic that forces it to use the PRIMARY index when the limit is low.
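The switch is visible if the limit is made explicit (a sketch; the second plan is inferred from the timing, and row estimates will vary):

explain extended select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
limit 1000   -- the 'key' column shows PRIMARY

explain extended select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
limit 10000  -- presumably back on entity_class_modification_date_int_idx, since the query finishes in 250 ms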