We need to select the Tests that only use the tags connected to Machine.
Tags are many to many to Tests. (TagTest associative table)
Tags are many to many to Machines. (TagMachine associative table)
Examples:
If a Test has tags [A,B,C] and Machine has [A,B,D] the test should not be chosen because its tags are not a subset of Machine's tags.
If a Test has tags [A,B] and Machine has [A,B,D] the test should be included.
If a Test has no tags it should always be included.
Something like this construct should work:
SELECT *
FROM Test te
WHERE
(SELECT tt.tagId
FROM TagTest tt
WHERE tt.testId = te.Id)
IN
(SELECT tm.tagId
FROM TagMachine tm
WHERE tm.machineId = 123)
However is this kind of query possible? If not, how is it possible to achieve the desired result?
IN()
can't do this by itself. You can make two CTE's and JOIN them together, but it's still a bit tricky.
Instead, let's turn the problem around. Instead of looking for records that match all good tags, we can look for records that are missing any one required tag. From the first example in the question ([A,B,C]
vs [A,B,D]
), we're looking for the TestTag
records with the C
tag. Once we have this information, we can use it in a subquery to exclude all Test
records with an Id
that appeared in those results.
So the first thing to do is use an exclusion join to find TestTag
results where the corresponding TagMachine
record is missing:
SELECT tt.testId, tt.tagId
FROM TestTag tt
LEFT JOIN TagMachine tm ON tm.machineId = 123 AND tm.tagId = tt.tagId
WHERE tm.tagId IS NULL
The presence of any testId
in the results of the above query makes the Test
with that Id
ineligible... but we do want all other Test
records. So now just limit this to DISTINCT testId
and use it as a subquery in any of an exclusion join, NOT IN(), or NOT EXISTS(). Take your pick:
SELECT *
FROM Tests
WHERE Id NOT IN (
--identify tests hat are missing at least one tag
SELECT DISTINCT tt.testId
FROM TestTag tt
LEFT JOIN TagMachine tm ON tm.machineId = 123 AND tm.tagId = tt.tagId
WHERE tm.tagId IS NULL)