Search code examples
mysqlrowsexamine

MySQL Many Rows Examined by Simple Query


There is a table tb_tag_article like

CREATE TABLE `tb_tag_article` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tag_id` int(16) NOT NULL,
  `article_id` int(16) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `key_tag_id_article_id` (`tag_id`,`article_id`) USING BTREE,
) ENGINE=InnoDB AUTO_INCREMENT=365944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

When I query like this, the result is 5120.

SELECT count(*) FROM tb_tag WHERE tag_id = 43

But when I explain the query like this

EXPLAIN SELECT count(*) FROM tb_tag WHERE tag_id = 43

examined rows is 13634.

+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
| id   | select_type | table          | type | possible_keys         | key                   | key_len | ref   | rows  | Extra       |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
|    1 | SIMPLE      | tb_tag_article | ref  | key_tag_id_article_id | key_tag_id_article_id | 4       | const | 13634 | Using index |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+

The query use Index but the numbers of examined rows greater than count of real data. What's the problem?


Solution

  • Q: What's the problem?

    A: It doesn't look like there's any problem.

    The value for the "rows" column in the EXPLAIN output is an estimate, not an exact number.

    Ref: http://dev.mysql.com/doc/refman/5.5/en/explain-output.html


    For evaluating the "cost" of each possible access path, the optimizer only needs estimates in order to compare the efficiency of using a range scan operation on index vs. a full scan of all rows in the table. The optimizer doesn't need "exact" counts of the total number rows in the table, or the number of rows that will satisfy a predicate.

    For this simple query, there are only a couple of possible plans that MySQL will consider.

    And that estimate of 13684 isn't that far off from the exact count of rows. It's off by a factor of 2.5, but MySQL is coming up with the right execution plan: using the index, rather than checking every row in the table.

    There is no problem.