Search code examples
mysqlinnodb

2 similar query in MySQL, 2 very different performance, why?


We have a pretty large database where we query a pack of data base on a datetime column. Yesterday we had a problem where we found that a specific query that usually took 4s now took more than 40s.

After some digging and debugging we found the problem.

mysql> explain select count(*) from event where survey_id = 158 and event_datez>'2019-10-30 00:00:00' and event_datez<'2019-11-28 23:59:59' ; # Query takes 4s
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
| id | select_type | table        | partitions | type  | possible_keys                                 | key              | key_len | ref  | rows    | filtered | Extra                              |
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
|  1 | SIMPLE      |        event | NULL       | range | FK_g1lx0ea096nqioytyhtjng72t, i_event_2       | i_event_2        | 6       | NULL | 2975160 |    50.00 | Using index condition; Using where |
+----+-------------+--------------+------------+-------+-----------------------------------------------+------------------+---------+------+---------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select count(*) from event where survey_id = 158 and event_datez>'2019-10-29 00:00:00' and event_datez<'2019-11-28 23:59:59' ; # Query takes 40s
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
| id | select_type | table        | partitions | type | possible_keys                                 | key                          | key_len | ref   | rows     | filtered | Extra       |
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
|  1 | SIMPLE      | event        | NULL       | ref  | FK_g1lx0ea096nqioytyhtjng72t,i_event_2        | FK_g1lx0ea096nqioytyhtjng72t | 9       | const | 16272884 |    12.23 | Using where |
+----+-------------+--------------+------------+------+-----------------------------------------------+------------------------------+---------+-------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

The problem was that InnoDB changed which index use for the query. My question is simple why this was happening?


Solution

  • In a book's index, why don't they include common words like "the" or "and"? Because it would match every page in the book, and it wouldn't be useful to look up the values in the index. You might as well just read all the pages in the book, cover to cover.

    MySQL will not use an index if it estimates that the condition would match a large percentage of the rows. The precise threshold is not documented, but in my experience it is around 20-25% of the table. Note that MySQL index statistics aren't always perfect either; they are an estimate based on sampled data.

    The range condition on dates is slightly broader in your second query. Therefore it matches more rows. It could be that this was just enough over the threshold, so MySQL decided not to use the i_event_2 index.

    MySQL may also have a slight preference for a query optimization plan that use the type: ref over type: range.

    You could use an index hint to make MySQL consider only the i_event_2 index.

    select count(*) from event USE INDEX (i_event_2)
    where survey_id = 158
      and event_datez>'2019-10-29 00:00:00' 
      and event_datez<'2019-11-28 23:59:59' ;
    

    But I think it would be even better to create a composite index over the two columns:

    ALTER TABLE event ADD INDEX i_event_survey_datez (survey_id, event_datez);