This is kinda weird. I have the next query:
SELECT * , GROUP_CONCAT( x.tag
SEPARATOR ',' ) AS tags
FROM tag AS t, tag AS x, tag_message_rel AS r, message m
INNER JOIN `user` AS u ON m.user_id = u.id
WHERE t.tag
IN (
'kikikiki', 'dsa'
)
AND m.id = r.message_id
AND t.id = r.tag_id
AND x.id = r.tag_id
GROUP BY m.id
HAVING COUNT( * ) >=2
ORDER BY m.created_at DESC
LIMIT 0 , 20
As you can see i use t to join find the messages that i want, on the other side i use x to print the tags of a message. I i erase the line:
AND x.id = r.tag_id
I will get the messages that i want, but tags will have ALL the tags in the tags table separated by coma. If i leave the line there, i will only get those 2 tags. If i use explain i get:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE u system PRIMARY NULL NULL NULL 1 Using temporary; Using filesort
1 SIMPLE t range PRIMARY,tag tag 252 NULL 2 Using where
1 SIMPLE x eq_ref PRIMARY PRIMARY 4 verse.t.id 1
1 SIMPLE r ALL NULL NULL NULL NULL 180 Using where; Using join buffer
1 SIMPLE m eq_ref PRIMARY PRIMARY 4 verse.r.message_id 1 Using where
Now im no expert in this, but i think the problem is that it is refusing to re-join a table in the process of optimizing the query.
What do you think? Any quick fix?
The problem is that you are trying to join to the tag
table twice, but you really need to join to the tag_message_rel
table twice, and from each of these to the respective row in the tag
table.
Think of "table aliases" as referring to a row in a table, not the table itself. That idea helped me to understand complex joins a lot better.
Here's how I'd write that query:
SELECT m.*, u.*, GROUP_CONCAT(DISTINCT x.tag) AS tags
FROM message m
JOIN `user` u ON (u.id = m.user_id)
JOIN tag_message_rel r1 ON (m.id = r1.message_id)
JOIN tag t ON (t.id = r1.tag_id)
JOIN tag_message_rel r2 ON (m.id = r2.message_id)
JOIN tag x ON (x.id = r2.tag_id)
WHERE t.tag IN ('kikikiki', 'dsa')
GROUP BY m.id
HAVING COUNT(DISTINCT t.tag) = 2
ORDER BY m.created_at DESC
LIMIT 0 , 20;
You should develop the habit of using JOIN
syntax consistently. Mixing JOIN
and comma-style joins can cause some subtle problems.
Here's an alternative query, that pulls some of the joins into a non-correlated subquery, so you avoid a Cartesian product between t
and x
, and eliminate the DISTINCT
modifiers in the group functions.
SELECT m.*, u.*, GROUP_CONCAT(x.tag) AS tags
FROM message m
JOIN `user` u ON (u.id = m.user_id)
JOIN tag_message_rel r ON (m.id = r.message_id)
JOIN tag x ON (x.id = r.tag_id)
WHERE m.id = ANY (
SELECT m2.id
FROM message m2
JOIN tag_message_rel r2 ON (m2.id = r2.message_id)
JOIN tag t ON (t.id = r2.tag_id)
WHERE t.tag IN ('kikikiki', 'dsa')
GROUP BY m2.id
HAVING COUNT(t.tag) = 2)
GROUP BY m.id
ORDER BY m.created_at DESC
LIMIT 0 , 20;