I know there is usually a better way to do something, but I'm not sure what it is here. What is the best way to optimize this query? Should I use joins, a separate query, or something else? I know it's not a complex query; I'm just trying to expand my knowledge.
Any suggestions would be appreciated!
SELECT
community_threads.id AS thread_id,
community_threads.title AS thread_title,
community_threads.date AS thread_date,
community_threads.author_id AS author_id,
`user`.display_name AS author_name,
`user`.organization AS author_organization,
(SELECT date FROM community_replies replies WHERE replies.thread_id = community_threads.id ORDER BY date DESC LIMIT 1) AS reply_date,
(SELECT count(id) FROM community_replies replies WHERE replies.thread_id = community_threads.id ORDER BY date DESC LIMIT 1) AS total_replies
FROM
community_threads
INNER JOIN `user` ON community_threads.author_id = `user`.id
WHERE
category_id = '1'
ORDER BY
reply_date DESC
LIMIT 0, 5
This can be improved with a JOIN against a subselect which gets the aggregate COUNT() per thread_id and the aggregate MAX(date). Instead of evaluating the subselect for each row, the derived tables should be evaluated only once for the entire query and joined against the rest of the rows from community_threads.
SELECT
community_threads.id AS thread_id,
community_threads.title AS thread_title,
community_threads.date AS thread_date,
community_threads.author_id AS author_id,
`user`.display_name AS author_name,
`user`.organization AS author_organization,
/* From the joined subqueries */
maxdate.date AS reply_date,
threadcount.num AS total_replies
FROM
community_threads
INNER JOIN `user` ON community_threads.author_id = `user`.id
/* JOIN against subqueries to return MAX(date) (same as order by date DESC limit 1) and COUNT(*) from replies */
/* number of replies per thread_id */
INNER JOIN (
SELECT thread_id, COUNT(*) AS num FROM community_replies GROUP BY thread_id
) threadcount ON community_threads.id = threadcount.thread_id
/* Most recent date per thread_id */
INNER JOIN (
SELECT thread_id, MAX(date) AS date FROM community_replies GROUP BY thread_id
) maxdate ON community_threads.id = maxdate.thread_id
WHERE
category_id = '1'
ORDER BY
reply_date DESC
LIMIT 0, 5
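Since both derived tables group the same replies by thread_id, they can probably be merged into one, so the replies table is only aggregated once. The following is a sketch rather than a tested drop-in: it assumes the replies live in community_replies with thread_id and date columns, as in the question, and the names reply_stats and last_reply are made up for illustration. It also uses a LEFT JOIN, which is a deliberate change: like the original correlated subqueries, it keeps threads that have no replies yet (NULL reply_date, 0 replies), whereas an INNER JOIN would drop them.

```sql
/* Sketch only: reply_stats and last_reply are hypothetical names */
SELECT
    community_threads.id AS thread_id,
    community_threads.title AS thread_title,
    community_threads.date AS thread_date,
    community_threads.author_id AS author_id,
    `user`.display_name AS author_name,
    `user`.organization AS author_organization,
    reply_stats.last_reply AS reply_date,
    /* Threads without replies have no row in reply_stats, so report 0 */
    COALESCE(reply_stats.num, 0) AS total_replies
FROM
    community_threads
    INNER JOIN `user` ON community_threads.author_id = `user`.id
    /* One pass over community_replies yields both aggregates */
    LEFT JOIN (
        SELECT thread_id, MAX(date) AS last_reply, COUNT(*) AS num
        FROM community_replies
        GROUP BY thread_id
    ) reply_stats ON community_threads.id = reply_stats.thread_id
WHERE
    category_id = '1'
ORDER BY
    reply_date DESC
LIMIT 0, 5
```

If threads without replies should not be listed, switching the LEFT JOIN back to an INNER JOIN restores the behaviour of the two-subquery version.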
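You may get even better performance if you put the LIMIT 0, 5 inside the reply_date subquery. That will only pull the most recent 5 in the subquery, and the INNER JOIN will discard all rows from community_threads not matching. One caveat: the LIMIT in the subquery is applied before the category_id filter, so if some of the 5 most recently replied threads belong to other categories, this version returns fewer than 5 rows.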
/* I *think* this will work...*/
SELECT
community_threads.id AS thread_id,
community_threads.title AS thread_title,
community_threads.date AS thread_date,
community_threads.author_id AS author_id,
`user`.display_name AS author_name,
`user`.organization AS author_organization,
/* From the joined subqueries */
maxdate.date AS reply_date,
threadcount.num AS total_replies
FROM
community_threads
INNER JOIN `user` ON community_threads.author_id = `user`.id
INNER JOIN (
SELECT thread_id, COUNT(*) AS num FROM community_replies GROUP BY thread_id
) threadcount ON community_threads.id = threadcount.thread_id
/* LIMIT in this subquery */
INNER JOIN (
SELECT thread_id, MAX(date) AS date FROM community_replies GROUP BY thread_id ORDER BY date DESC LIMIT 0, 5
) maxdate ON community_threads.id = maxdate.thread_id
WHERE
category_id = '1'
ORDER BY
reply_date DESC
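Because the LIMIT inside the subquery runs before the outer WHERE, the 5 rows it keeps are the 5 most recently replied threads across all categories. One way around that might be to apply the category filter inside the derived table as well. This is a rough sketch, not something verified against the real schema: it assumes category_id is a column of community_threads, and the aliases r, ct, t, u and recent are invented here.

```sql
/* Sketch only: assumes category_id belongs to community_threads;
   aliases r, ct, t, u and recent are hypothetical */
SELECT
    t.id AS thread_id,
    t.title AS thread_title,
    t.date AS thread_date,
    t.author_id AS author_id,
    u.display_name AS author_name,
    u.organization AS author_organization,
    recent.reply_date AS reply_date,
    recent.total_replies AS total_replies
FROM (
    /* Filter by category first, then keep the 5 most recently replied threads */
    SELECT r.thread_id, MAX(r.date) AS reply_date, COUNT(*) AS total_replies
    FROM community_replies r
    INNER JOIN community_threads ct ON ct.id = r.thread_id
    WHERE ct.category_id = '1'
    GROUP BY r.thread_id
    ORDER BY reply_date DESC
    LIMIT 0, 5
) recent
INNER JOIN community_threads t ON t.id = recent.thread_id
INNER JOIN `user` u ON t.author_id = u.id
ORDER BY
    recent.reply_date DESC
```

Whether this beats the plain derived-table join depends on the data and the available indexes, so it is worth comparing the EXPLAIN output of both before settling on one.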