Search code examples
sqlmany-to-many

How to select all records referenced by a specified set of other records


I have two tables with a many-to-many association. I'd like to be able to run a single, parameterized, query that returns all rows on one side that is referenced by a specified set (of arbitrary size) of records in the other table.

To illustrate the problem, consider the case (comparable to my own) where the tables are:

  1. Movie
  2. Actor
  3. Movie_Actor linking table

It is fairly trivial to write a query to return all movies that a given actor is in, or even any specific number of actors have co-starred in, e.g.

SELECT * FROM Movie
WHERE Movie.ID IN (
  SELECT Movie_Actor.Movie_ID FROM Movie_Actor
  WHERE Movie_Actor.Actor_ID = ?
  )

Then you can use INTERSECT or multiple statements in the WHERE clause to narrow it further. But these approaches all, as far as I can tell, rely on a predetermined, fixed number of conditionals in the query. Sure, you could assemble such a query in code, but that defeats my purpose. I just want a single query that I can stick the parameters in.

And if I do WHERE Movie_Actor.Actor_ID IN ? then of course I just get a list of all movies any actor in the list has been in.

I can't see how I could query to get a list of all movies that a set of actors have been in, where that set is larger than 0, but its size is otherwise unconstrained.

Is there a way to construct such a query that would accept a set of actor_ids as parameters?


Solution

  • One option is to use multiple exists conditions:

    select m.*
    from movie m
    where 
        exists (select 1 from movie_actor mv where mv.movie_id = m.id and mv.actor_id = ?)
        and exists (select 1 from movie_actor mv where mv.movie_id = m.id and mv.actor_id = ?)
        and ...
    

    That should be a rather efficent approach.

    On the other hand, you can use aggregation - which is shorter to write, but might be a bit slower (depending on your dataset), and requires you to enumerate the columns to keep in movie:

    select m.id, m.name
    from movie m
    inner join movie_actor mv 
        on  mv.movie_id = m.id 
        and mv.actor_id in (?, ?)  -- either one actor or the other
    group by m.id, m.name
    having count(*) = 2            -- both matched