Search code examples
sqlpostgresqlindexingmany-to-manyrelational-division

AND conditions in many-to-many relation


Say I have three tables, a table of users, a table of around 500 different items, and the corresponding join table. What I would like to do is:

select * from users u join items_users iu on iu.user_id = u.id
where iu.item_id in (1,2,3,4,5)
and u.city_id = 1 limit 10;

Except, instead of an IN condition, I would like to find users that have all the corresponding items. If it helps, assume that the max number of items that will be searched for at a time will be 5. Also, I am using Postgres, and don't mind denormalizing it if would help as it's a read only system and speed is highest priority.


Solution

  • It's another case of relational division. We have assembled quite an arsenal of queries to deal with this class of problems here.

    In this case, with 5 or more items, I might try:

    SELECT u.*
    FROM   users AS u
    WHERE  u.city_id = 1
    AND EXISTS (
       SELECT *
       FROM   items_users AS a
       JOIN   items_users AS b USING (user_id)
       JOIN   items_users AS c USING (user_id)
       ...
       WHERE  a.user_id = u.user_id
       AND    a.item_id = 1
       AND    b.item_id = 2
       AND    c.item_id = 3
       ...
       )
    LIMIT 10;
    

    It was among the fastest in my tests and it fits the requirement of multiple criteria on items_users while only returning columns from user.

    Read about indexes at the linked answer. these are crucial for performance. As your tables are read-only I would also CLUSTER both tables, to minimize the number of pages that have to be visited. If nothing else, CLUSTER items_users using a multi-column index on (user_id, item_id).