Tags: mysql, sql, performance, optimization, database-optimization

SQL Query Optimization When Using Multiple Joins and a Large Record Set


I am making a message board and I am trying to retrieve regular topics (i.e., topics that are not stickied) and sort them by the date of the last posted message. I can accomplish this, but with about 10,000 messages and 1,500 topics the query takes more than 60 seconds.

Is there anything I can do to my query to improve performance, or is my design fundamentally flawed?

Here is the query that I am using.

SELECT Messages.topic_id,
       Messages.posted,
       Topics.title,
       Topics.user_id,
       Users.username
FROM Messages
LEFT JOIN Topics USING (topic_id)
LEFT JOIN Users ON Users.user_id = Topics.user_id
WHERE Messages.message_id IN (
    SELECT MAX(message_id)
    FROM Messages
    GROUP BY topic_id)
AND Messages.topic_id NOT IN (
    SELECT topic_id
    FROM StickiedTopics)
AND Messages.posted IN (
    SELECT MIN(posted)
    FROM Messages
    GROUP BY message_id)
AND Topics.board_id = 1
ORDER BY Messages.posted DESC
LIMIT 50

Edit: Here is the EXPLAIN plan.

+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+
| id | select_type        | table          | type           | possible_keys    | key      | key_len | ref                     | rows | Extra                                        |
+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | Topics         | ref            | PRIMARY,board_id | board_id | 4       | const                   |  641 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | Users          | eq_ref         | PRIMARY          | PRIMARY  | 4       | spergs3.Topics.user_id  |    1 |                                              |
|  1 | PRIMARY            | Messages       | ref            | topic_id         | topic_id | 4       | spergs3.Topics.topic_id |    3 | Using where                                  |
|  4 | DEPENDENT SUBQUERY | Messages       | index          | NULL             | PRIMARY  | 8       | NULL                    |    1 |                                              |
|  3 | DEPENDENT SUBQUERY | StickiedTopics | index_subquery | topic_id         | topic_id | 4       | func                    |    1 | Using index                                  |
|  2 | DEPENDENT SUBQUERY | Messages       | index          | NULL             | topic_id | 4       | NULL                    |    3 | Using index                                  |
+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+

Indexes

+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Messages |          0 | PRIMARY  |            1 | message_id  | A         |        9956 |     NULL | NULL   |      | BTREE      |         |
| Messages |          0 | PRIMARY  |            2 | revision_no | A         |        9956 |     NULL | NULL   |      | BTREE      |         |
| Messages |          1 | user_id  |            1 | user_id     | A         |         432 |     NULL | NULL   |      | BTREE      |         |
| Messages |          1 | topic_id |            1 | topic_id    | A         |        3318 |     NULL | NULL   |      | BTREE      |         |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Topics |          0 | PRIMARY  |            1 | topic_id    | A         |        1205 |     NULL | NULL   |      | BTREE      |         |
| Topics |          1 | user_id  |            1 | user_id     | A         |         133 |     NULL | NULL   |      | BTREE      |         |
| Topics |          1 | board_id |            1 | board_id    | A         |           1 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Users |          0 | PRIMARY         |            1 | user_id     | A         |        2051 |     NULL | NULL   |      | BTREE      |         |
| Users |          0 | username_UNIQUE |            1 | username    | A         |        2051 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

Solution

  • I would start by building the base set of qualifying topics, get those IDs, and only then join out to the other tables. The inner query groups by topic_id and takes MAX(message_id), so it returns one pre-qualified row per topic. It also LEFT JOINs to StickiedTopics. Why? A left join lets me detect the topics that are found there (the ones you want to exclude), so the WHERE clause keeps only rows where the StickiedTopics topic ID IS NULL (i.e., not found). That pares the list down significantly without several nested subqueries. From that result, we join to Messages, Topics (with the board_id = 1 qualifier), and Users to pick up the remaining columns. Finally, a single WHERE ... IN sub-select applies your MIN(posted) qualifier. I don't understand the basis of that condition, but I left it in because it was part of your original query. Then come the ORDER BY and LIMIT.

    SELECT STRAIGHT_JOIN
           M.topic_id,
           M.posted,
           T.title,
           T.user_id,
           U.username
    FROM
           -- Pre-qualify: latest message_id per topic, excluding stickied topics
           ( SELECT M1.topic_id,
                    MAX(M1.message_id) AS MaxMsgPerTopic
             FROM Messages M1
             LEFT JOIN StickiedTopics ST
                    ON M1.topic_id = ST.topic_id
             WHERE ST.topic_id IS NULL      -- keep only topics NOT found in StickiedTopics
             GROUP BY M1.topic_id ) PreQuery
    JOIN Messages M
           ON PreQuery.MaxMsgPerTopic = M.message_id
    JOIN Topics T
           ON M.topic_id = T.topic_id
          AND T.board_id = 1
    LEFT JOIN Users U
           ON T.user_id = U.user_id
    -- Kept as-is from the original query
    WHERE M.posted IN ( SELECT MIN(posted)
                        FROM Messages
                        GROUP BY message_id )
    ORDER BY M.posted DESC
    LIMIT 50
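
    To check the exclusion step in isolation, the pre-query can be run on its own under EXPLAIN. This is only a sketch, assuming the table and column names from the question. The NOT EXISTS variant is an alternative I am adding for comparison; it is not part of the query above.

    ```sql
    -- Form used in the pre-query: LEFT JOIN plus IS NULL (anti-join)
    EXPLAIN
    SELECT M1.topic_id,
           MAX(M1.message_id) AS MaxMsgPerTopic
    FROM Messages M1
    LEFT JOIN StickiedTopics ST
           ON ST.topic_id = M1.topic_id
    WHERE ST.topic_id IS NULL            -- no matching row in StickiedTopics
    GROUP BY M1.topic_id;

    -- Alternative form (my substitution): NOT EXISTS returns the same rows
    EXPLAIN
    SELECT M1.topic_id,
           MAX(M1.message_id) AS MaxMsgPerTopic
    FROM Messages M1
    WHERE NOT EXISTS ( SELECT 1
                       FROM StickiedTopics ST
                       WHERE ST.topic_id = M1.topic_id )
    GROUP BY M1.topic_id;
    ```

    Both statements return the same set of topics. Which one the optimizer handles better depends on the MySQL version, so compare the two plans (in particular the select_type and rows columns) on your own data rather than assuming one is faster.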