Search code examples
sqlsql-serverjoinrelational-division

SQL query, How to get entities not used in at least one input group or not used in any groups?


Let me elaborate with examples.

I have three tables Entity(EntityId int), Group(GroupId int), and Mapping(MapId int, GroupId int, EntityId int).

Let's take a few rows of data.

Entity
EntityId
1
2
3
4

Group
GroupId
101
102
103

Mapping
MapId GroupId EntityId
1       101      2
2       102      1
3       102      3
4       103      3

I need to select entities that are not used in at least one input group or are not used in any non-input groups.

Input-GroupId Output-Entities
   101          1,3,4 ( 2 is mapped with 101)
   101,102      1,2,3,4( 1 is mapped with 102 but not with 101 and 2 is mapped with 101 but not with 102)
   102,103      1,2,4 ( 3 is mapped with both 102 and 103 so it will not be in output.)  

If there are two input groups. According to the below ven diagram, I need values of all cross-red line areas.

https://www.onlinemath4all.com/venn-diagram-of-a-intersection-b-whole-complement.html

I tried a few join queries but I am able to get only values that are not used in any groups.

Edit 1:-

SELECT distinct EntityId from Entity e
JOIN Mapping map ON e.EntityId != map.EntityId 
JOIN #temp t ON map.GroupId = t.GroupId  order by EntityId

I have tried this query. It works fine for Input-Groups- (101), (101, 102) but does not work for the case (102, 103).

Edit 2:- A few comments mentioned that the ven diagram is not clear. So this edit.

Let's consider, A(101) and B(102), two groups are provided as input. Hand made ven diagram then all red dotted areas will be considered for the output. The blue pen highlighted area will not be considered.


Solution

  • For a fixed list of groups given as input, you want entities that do not belong to all groups.

    That’s a variant of the relational division problem. I would recommend passing the list of groups as rows using values(), and combining it with the entities table. Then we can bring the mapping table with a left join, and use the having clause to filter on the count of matches.

    with p as (select * from ( values (101), (102) ) p(groupId) )
    select e.entityId
    from p
    cross join entity e
    left join mapping m
        on  m.entityId = e.entityId
        and m.groupId  = p.groupId
    group by e.entityId
    having count(m.entityId) <> ( select count(*) from p)
    

    As an alternative for input, you could have the query accept a comma-separated list of values (hence as a single string parameter), and split it to rows with string_split - if that’s easier for the client application to manage.

    Here is a demo on DB Fiddle, which yields the expected results for the three test cases of your question:

    Input (GroupId)   Output (EntityId)
    101               1,3,4
    101,102           1,2,3,4
    102,103           1,2,4