I have a table on mysql and two queries whose performances are quite different. I have extracted plans of the queries, but I couldn't fully understand the reason behind the performance difference.
The table:
+-------------+----------------------------------------------+------------------------------------+
| TableA | | |
+-------------+----------------------------------------------+------------------------------------+
| id | int(10) unsigned NOT NULL AUTO_INCREMENT | |
| userId | int(10) | unsigned DEFAULT NULL |
| created | timestamp | NOT NULL DEFAULT CURRENT_TIMESTAMP |
| PRIMARY KEY | id | |
| KEY userId | userId | |
| KEY created | created | |
+-------------+----------------------------------------------+------------------------------------+
Keys/Indices: The primary key on id
field, a key on userId
field ASC
, another key on created
field ASC.
tableA
is a very big table, it contains millions of rows.
The query I run on this table is:
The user with id 1234
has 1.5M records in this table. I want to fetch its latest 100 rows. In order to achieve this, I have 2 different queries:
Query 1:
SELECT * FROM tableA USE INDEX (userId)
WHERE userId=1234 ORDER BY created DESC LIMIT 100;
Query 2:
SELECT * FROM tableA
WHERE userId=1234 ORDER BY id DESC LIMIT 100;
Since id
field of tableA
is auto increment, the condition of being latest is preserved. These 2 queries return the same result. However, there is a huge performance difference.
Query plans are:
+----------+-----------------------------------------------+-------------------------------+------+---------------------------------------+
| Query No | Operation | Params | Raws | Raw desc |
+----------+-----------------------------------------------+-------------------------------+------+---------------------------------------+
| Query 1 | Sort(using file sort) Unique index scan (ref) | table: tableA; index: userId; | 2.5M | Using index condition; Using filesort |
| Query 2 | Unique index scan (ref) | table: tableA; index: userId; | 2.5M | Using where |
+----------+-----------------------------------------------+-------------------------------+------+---------------------------------------+
+--------+-------------+
| | Performance |
+--------+-------------+
| Query1 | 7,5 s |
+--------+-------------+
| Query2 | 741 ms |
+--------+-------------+
I understand that there is a sorting operation on Query 1. In each query, the index used is userId
. But why is sorting not used in Query 2? How does the primary index affect?
Mysql 5.7
Edit: There are more columns on the table, I have extracted them from the table definition above.
Since id field of tableA is auto increment, the condition of being latest is preserved.
That is usually a valid statement.
WHERE userId=1234 ORDER BY created DESC LIMIT 100
needs this 'composite' index: (userId, created)
. With that, it will hit only 100 rows, regardless of the table size or the number of rows for that user.
The same goes for
WHERE userId=1234 ORDER BY id DESC LIMIT 100;
Namely that it needs (userId, id)
. However, in InnoDB, when you say INDEX(x)
it silently tacks on the PRIMARY KEY
columns. So you effectively get INDEX(x,id)
. This is why your plain INDEX(userId)
worked well.
EXPLAIN
rarely (if ever) takes into account the LIMIT
. This is why 'Rows' is "2.5M" for both queries.
The first query might (or might not) have used INDEX(userId)
if you took out the USE INDEX
hint. The choice depends on what percentage of the table has userId = 1234
. If it is less than about 20%, the index would be used. But it would bounce back and forth between the secondary index and the data -- all 1.5 million times. If more than 20%, it would avoid the bouncing by simply reading all the "millions" of rows, ignoring those that don't apply.
Note: What you had for Q1 will still read at least 1.5M rows, sort them ("Using filesort"), then peel off the desired 100. But with INDEX(userId, created)
, it can skip the sort and look at only 100 rows.
I cannot explain "Unique index scan" without seeing SHOW CREATE TABLE
and the un-annotated EXPLAIN
. (EXPLAIN FORMAT=JSON SELECT...
might provide more insight.)