I have two tables with a many-to-many association. I'd like to be able to run a single, parameterized, query that returns all rows on one side that is referenced by a specified set (of arbitrary size) of records in the other table.
To illustrate the problem, consider the case (comparable to my own) where the tables are:
It is fairly trivial to write a query to return all movies that a given actor is in, or even any specific number of actors have co-starred in, e.g.
SELECT * FROM Movie
WHERE Movie.ID IN (
SELECT Movie_Actor.Movie_ID FROM Movie_Actor
WHERE Movie_Actor.Actor_ID = ?
)
Then you can use INTERSECT or multiple statements in the WHERE clause to narrow it further. But these approaches all, as far as I can tell, rely on a predetermined, fixed number of conditionals in the query. Sure, you could assemble such a query in code, but that defeats my purpose. I just want a single query that I can stick the parameters in.
And if I do WHERE Movie_Actor.Actor_ID IN ?
then of course I just get a list of all movies any actor in the list has been in.
I can't see how I could query to get a list of all movies that a set of actors have been in, where that set is larger than 0, but its size is otherwise unconstrained.
Is there a way to construct such a query that would accept a set of actor_ids as parameters?
One option is to use multiple exists
conditions:
select m.*
from movie m
where
exists (select 1 from movie_actor mv where mv.movie_id = m.id and mv.actor_id = ?)
and exists (select 1 from movie_actor mv where mv.movie_id = m.id and mv.actor_id = ?)
and ...
That should be a rather efficent approach.
On the other hand, you can use aggregation - which is shorter to write, but might be a bit slower (depending on your dataset), and requires you to enumerate the columns to keep in movie
:
select m.id, m.name
from movie m
inner join movie_actor mv
on mv.movie_id = m.id
and mv.actor_id in (?, ?) -- either one actor or the other
group by m.id, m.name
having count(*) = 2 -- both matched