Search code examples
mysqlperformancequery-optimizationdatabase-performance

MySQL query becomes extremely slow when using Order By


I have a messages table with 15 million rows.

Below query returns about 5 million(But limited in 15 rows) records in less than 1 second :

SELECT messages.* FROM messages 
INNER JOIN gateways ON
messages.gateway_id=gateways.id
INNER JOIN orders ON
gateways.order_id=orders.id
WHERE orders.user_id=6500 AND messages.deleted=0
AND messages.type='Out' LIMIT 15;

But when I adding an Order ByidDESC to the end of it, it becomes extremely slow about ~40 seconds :

SELECT messages.* FROM messages 
INNER JOIN gateways ON
messages.gateway_id=gateways.id
INNER JOIN orders ON
gateways.order_id=orders.id
WHERE orders.user_id=6500 AND messages.deleted=0
AND messages.type='Out' ORDER BY messages.id DESC LIMIT 15;

Any help would be great appreciated.


Solution

  • The pattern SELECT lots_of_stuff ORDER BY something LIMIT small_integer is notorious for causing performance trouble. Leaving out ORDER BY something makes the performance trouble go away. Why? Because the pattern with ORDER BY causes the MySQL server to sort a very large number of fairly large rows (5 million in your case), only to discard all but a small number of them. That uses a lot of RAM, CPU, and IO in your server, just to discard most of the work.

    Your best bet is to use a deferred join type of pattern here, where you sort nothing but the message.id values. Use this subquery to do that.

                       SELECT messages.id 
                         FROM messages 
                   INNER JOIN gateways ON messages.gateway_id=gateways.id
                   INNER JOIN orders   ON gateways.order_id=orders.id
                        WHERE orders.user_id=6500
                          AND messages.deleted=0
                          AND messages.type='Out'
                     ORDER BY messages.id DESC
                       LIMIT 15
    

    This will give you a nice little collection of 15 message.id values.

    Your next step is to work on optimizing this subquery. I suggest you try a compound covering index on your messages table, containing the columns (deleted, type, id, gateway_id). This shoud help accelerate it.

    You may need indexes on the other tables as well. You should consider using the EXPLAIN function in MySQL to analyze your performance.

    Finally, use that little collection of messages.id values to get the messages rows you need, like this. (This is the deferred join; you're deferring fetching the whole row until you know which rows you'll need. That way you don't have to ORDER the whole mess.)

    Edit Add a compound index on gateways (order_id, id) to avoid the full table scan of that table. It's not enormous, but this might help a bit.

    SELECT a.*
      FROM messages a
      JOIN (
                       SELECT messages.id 
                         FROM messages 
                   INNER JOIN gateways ON messages.gateway_id=gateways.id
                   INNER JOIN orders   ON gateways.order_id=orders.id
                        WHERE orders.user_id=6500
                          AND messages.deleted=0
                          AND messages.type='Out'
                     ORDER BY messages.id DESC
                       LIMIT 15
           ) b ON a.id = b.id
     ORDER BY a.id DESC