I'm trying to fetch relative hashtag.
I have many to many relation between hashtag and post table.
For example for hashtag 'Love', I try to get all hashtag where they post has Love
hashtag.
here is my query ( hashtag 67 is for 'Love' )
SELECT hashtag_id, count(hashtag_id) as count
from post_hashtag
where
# where posts has hashtag '67'
post_id in ( SELECT post_id FROM post_hashtag WHERE hashtag_id = 67 )
# remove hashtag 67 from result
and hashtag_id != 67
# group them and sort by count, so the must repeated hashtag is the best relative hashtag
GROUP by hashtag_id
ORDER by count desc
limit 4
I try to optimize my query but I can't optimize it more ( for now it took 2 - 12 second base on number of post )
Is there anyway to optimize it?
Explain query
+----+-------------+-----------------+------------+--------+---------------+---------------------------------------------------+---------+-------------------------------------------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+--------+---------------+---------------------------------------------------+---------+-------------------------------------------+---------+----------+-------------+
| 1 | SIMPLE | post_hashtag | NULL | index | NULL | fk_np_account_post_has_np_hashtag_np_hashtag1_idx | 4 | NULL | 4623584 | 100.00 | Using index |
| 1 | SIMPLE | hashtag | NULL | eq_ref | PRIMARY | PRIMARY | 4 | graphicj_novin.np_post_hashtag.hashtag_id | 1 | 100.00 | NULL |
+----+-------------+-----------------+------------+--------+---------------+---------------------------------------------------+---------+-------------------------------------------+---------+----------+-------------+
post_hashtag
has these field
post_id,hashtag_id
Both field is foreign key
MySQL often optimizes WHERE IN (SELECT ...)
poorly. Use a JOIN
instead.
SELECT p1.hashtag_id, count(*) as count
from post_hashtag AS p1
JOIN post_hashtag AS p2 ON p1.post_id = p2.post_id
WHERE p1.hashtag_id != 67
AND p2.hashtag_id = 67
GROUP by p1.hashtag_id
ORDER by count desc
limit 4