I have the following data:
tag_id | taggable_id
--------+-------------
1 | 1
2 | 1
3 | 1
4 | 1
1 | 2
2 | 2
5 | 2
6 | 2
7 | 3
8 | 3
9 | 3
10 | 3
And I want to bring all the taggable_ids which are in a group of tag_ids AND in another group AND another... (max 4 groups).
For example:
In a gross way(for the second example), the query is the following:
SELECT taggable_id FROM taggings WHERE tag_id in (1)
INTERSECT
SELECT taggable_id FROM taggings WHERE tag_id in (6)
INTERSECT
SELECT taggable_id FROM taggings WHERE tag_id in (2,8)
I think simplifying it, it looks like:
SELECT taggable_id FROM taggings WHERE tag_id in (1,2,6,8)
GROUP BY taggable_id HAVING COUNT(*)=3
but I'm wondering if can be done in a simpler way. Any thoughts?
This can be cast as a case of relational division. We have assembled an arsenal of query techniques under this related question:
Depending on data distribution and other factors, this may be fastest:
SELECT DISTINCT taggable_id
FROM taggings t1
JOIN taggings t2 USING (taggable_id)
JOIN taggings t3 USING (taggable_id)
WHERE t1.tag_id = 1
AND t2.tag_id = 6
AND t3.tag_id IN (2, 8);
Assuming unique (tag_id, taggable_id)
, DISTINCT
is actually not needed for the example. But it might be necessary with other (list) predicates.
SQL Fiddle (building on @Clodoaldo's, thanks).