Search code examples
sqlpostgresqlsqlalchemyrelational-division

Counting with multiple joins


Assuming you have 3 tables: posts, tags, post_tag. There's an m:n relationship between the posts and tags, post_tag is used to connect them.

How would you count the number of posts that have a particular pair of tags? For instance, counting the number of posts that are tagged with both "news" and "featured"?


Solution

  • This is a case of relational division. There are many ways to solve it.
    Assuming a standard many-to-many implementation, this should be among the fastest and simplest:

    SELECT count(*)
    FROM   post_tag pt1
    JOIN   post_tag pt2 USING (post_id)
    WHERE  pt1.tag_id = (SELECT tag_id FROM tags WHERE tag = 'news')
    AND    pt2.tag_id = (SELECT tag_id FROM tags WHERE tag = 'featured');
    

    That's counting posts with at least the two given tags. There can be more tags for the same post.

    We have assembled an arsenal of techniques under this related question: