I have several million records in the following table:
CREATE TABLE `customers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`store_id` int(10) unsigned DEFAULT NULL,
`first_name` varchar(64) DEFAULT NULL,
`middle_name` varchar(64) DEFAULT NULL,
`last_name` varchar(64) DEFAULT NULL,
`email` varchar(128) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_store_email` (`store_id`,`email`),
KEY `index_store_phone` (`store_id`,`phone`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Query #1 takes ~800ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1;
Query #2 takes ~1.5ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NULL;
Query #3 takes a whopping 5 seconds:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NOT NULL;
Notes:
store_id
s, but every record has one.email
set to null
.I find a few things strange here:
Any thoughts on this seemingly basic question? Feel like I'm missing something simple. Did I sleep through a class in db school?
At times the MySQL query parser guesses wrong when it decides which indices to use. For cases like these the index hints can be useful (http://dev.mysql.com/doc/refman/5.7/en/index-hints.html)
To force the use of an index:
SELECT * FROM table1 USE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
To force the use of an index including replacing table scans:
SELECT * FROM table1 FORCE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
To ignore a certain index:
SELECT * FROM table1 IGNORE INDEX (col3_index)
WHERE col1=1 AND col2=2 AND col3=3;
To debug which index is being used the EXPLAIN
statement can be used: (https://dev.mysql.com/doc/refman/5.7/en/explain-output.html)
EXPLAIN SELECT * FROM table1
WHERE col1=1 AND col2=2 AND col3=3;