There is a table tb_tag_article like
CREATE TABLE `tb_tag_article` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`tag_id` int(16) NOT NULL,
`article_id` int(16) NOT NULL,
PRIMARY KEY (`id`),
KEY `key_tag_id_article_id` (`tag_id`,`article_id`) USING BTREE,
) ENGINE=InnoDB AUTO_INCREMENT=365944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
When I query like this, the result is 5120.
SELECT count(*) FROM tb_tag WHERE tag_id = 43
But when I explain the query like this
EXPLAIN SELECT count(*) FROM tb_tag WHERE tag_id = 43
examined rows is 13634.
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
| 1 | SIMPLE | tb_tag_article | ref | key_tag_id_article_id | key_tag_id_article_id | 4 | const | 13634 | Using index |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
The query use Index but the numbers of examined rows greater than count of real data. What's the problem?
Q: What's the problem?
A: It doesn't look like there's any problem.
The value for the "rows
" column in the EXPLAIN output is an estimate, not an exact number.
Ref: http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
For evaluating the "cost" of each possible access path, the optimizer only needs estimates in order to compare the efficiency of using a range scan operation on index vs. a full scan of all rows in the table. The optimizer doesn't need "exact" counts of the total number rows in the table, or the number of rows that will satisfy a predicate.
For this simple query, there are only a couple of possible plans that MySQL will consider.
And that estimate of 13684 isn't that far off from the exact count of rows. It's off by a factor of 2.5, but MySQL is coming up with the right execution plan: using the index, rather than checking every row in the table.
There is no problem.