Search code examples
sqlsql-serverjpawhere-clauseconjunctive-normal-form

SQL where clause with seperataed IN expression to implement CNF


I need to query users according to thier dynamically added properties (transposed table)

I have 4 tables

  • users - for users
  • propertyGroup - for groups of properties, groups can be added dynamically to the db
  • propertyValue - for possible values of propertyGroup
  • usersPropertyValues - connect user to his relevant properties

Its look like this:

users:
------
| id | name | details |
|----|------|---------|
|  1 |  Joe |     foo |
|  2 |  Dan |     bar |

propertyGroup:
--------------
| id |           name |
|----|----------------|
|  1 |        Hobbies |
|  2 |       LikeFood |
|  3 | VisitedCountry |

propertyValue:
--------------
| id | propertyGroupId |         name |
|----|-----------------|--------------|
|  1 |               1 | Technologies |
|  2 |               1 |      Surfing |
|  3 |               2 |         Rice |
|  4 |               2 |         Meat |
|  5 |               2 |          Veg |
|  6 |               3 |          USA |
|  7 |               3 |       FRANCE |
|  8 |               3 |       ISRAEL |
|  9 |               3 |       CANADA |

usersPropertyValues:
--------------------
| userId | propertyValueId |
|--------|-----------------|
|      1 |               1 |
|      1 |               2 |
|      1 |               3 |
|      1 |               5 |
|      1 |               6 |
|      1 |               7 |
|      1 |               8 |
|      2 |               2 |
|      2 |               3 |
|      2 |               4 |
|      2 |               5 |
|      2 |               8 |
|      2 |               9 |
|      2 |               7 |

so a mix-em-all query will look like this:

select * 
from users as u
 join usersPropertyValues as upv on upv.userId = u.id
 join propertyValues as pv on pv.id = upv.propertyValueId
 join propertyGroup as pg on pg.id = pv.propertyGroupId

| id | name | details | userId | propertyValueId | id | propertyGroupId |         name | id |           name |
|----|------|---------|--------|-----------------|----|-----------------|--------------|----|----------------|
|  1 |  Joe |     foo |      1 |               1 |  1 |               1 | Technologies |  1 |        Hobbies |
|  1 |  Joe |     foo |      1 |               2 |  2 |               1 |      Surfing |  1 |        Hobbies |
|  1 |  Joe |     foo |      1 |               3 |  3 |               2 |         Rice |  2 |       LikeFood |
|  1 |  Joe |     foo |      1 |               5 |  5 |               2 |          Veg |  2 |       LikeFood |
|  1 |  Joe |     foo |      1 |               6 |  6 |               3 |          USA |  3 | VisitedCountry |
|  1 |  Joe |     foo |      1 |               7 |  7 |               3 |       FRANCE |  3 | VisitedCountry |
|  1 |  Joe |     foo |      1 |               8 |  8 |               3 |       ISRAEL |  3 | VisitedCountry |
|  2 |  Dan |     bar |      2 |               2 |  2 |               1 |      Surfing |  1 |        Hobbies |
|  2 |  Dan |     bar |      2 |               3 |  3 |               2 |         Rice |  2 |       LikeFood |
|  2 |  Dan |     bar |      2 |               4 |  4 |               2 |         Meat |  2 |       LikeFood |
|  2 |  Dan |     bar |      2 |               5 |  5 |               2 |          Veg |  2 |       LikeFood |
|  2 |  Dan |     bar |      2 |               8 |  8 |               3 |       ISRAEL |  3 | VisitedCountry |
|  2 |  Dan |     bar |      2 |               9 |  9 |               3 |       CANADA |  3 | VisitedCountry |
|  2 |  Dan |     bar |      2 |               7 |  7 |               3 |       FRANCE |  3 | VisitedCountry |

its all here: http://sqlfiddle.com/#!3/49329/1

I want to query the data by a set of propertyValueId and get all the users that matches a this set of properties with regarding to the group so the user need to have at least on property from every group - in othe words, a where calause that matches the general CNF clause like this: (pvId is abbreviation for propertyValueId)

Property group A        Property group B              Property group C
(pvId_x or pvId_y) and (pvId_w or pvId_z) and... and (pvId_m or pvId_k) 

in the above example pvId_x&pvId_y belonges to group A, pvId_w&pvId_z belonges to group B etc.

what I did not working, I tried to concat AND oparators of IN operators( in simulates the or part - the disjunctions) as showen here (query for the above sql fiddle):

select distinct u.name, u.id  
from users as u
 join usersPropertyValues as upv on upv.userId = u.id
 join propertyValues as pv on pv.id = upv.propertyValueId
 join propertyGroup as pg on pg.id = pv.propertyGroupId
 where (pv.propertyGroupId = 1 AND pv.id IN(1,2)) and (pv.propertyGroupId = 2 AND pv.id IN(5,6))

instead of getting both users (the both have (1 or 2) and (5 or 6) ) - i got none.

I understand why the result set is empty but I don understand how to implement the right where clause. - how it can be done?

My Question: How can I implement the CNF logic in the above SQL structure?

Edit:Excepted results examples: (regarding the sqlfiddle example:

input  --> output 
{1,5}  --> Joe (user with hobby:tech, likefood:veg)
{2,8}  --> Joe,Dan (user with hobby:surfing, VisitedCountry:Israel)
{2,8,9}  --> Dan (user with hobby:surfing, VisitedCountry:Israel,Canada)

BTW, I need to implement this finally in JPA so if there is a solution in JPA it's also great. but if not - I will translate it... Thanks


Solution

  • As I understand your notation you want to query for all users that are associated with two propertyvalueid (so {1,5} goes to property value ids of hobby:tech, likefood:veg)

    Once you phrase it like that it is simple first get a list of one then get a list of the other and find the elements in both, like this:

    select *
    from users 
    where id in (
      select userid from userPropertyValues where propertyvalueid  = 1
      intersect
      select userid from userPropertyValues where propertyvalueid  = 5
    ) sub
    

    note I could use an inner join instead of the intersect operator but intersect is sexier. If your platform does not support it just use a join.


    As a join:

    select *
    from users 
    join (select userid from userPropertyValues where propertyvalueid=1) a on a.userid = users.id
    join (select userid from userPropertyValues where propertyvalueid=5) b on b.userid = users.id
    

    or more simply stated as (this is AND condition - 1 and 5)

    select *
    from users 
    join userPropertyValues a on a.userid = users.id and a.propertyvalueid=1
    join userPropertyValues b on b.userid = users.id and b.propertyvalueid=5
    

    (this is OR condition - 1 OR 5)

    select *
    from users 
    left join userPropertyValues a on a.userid = users.id and a.propertyvalueid=1
    left join userPropertyValues b on b.userid = users.id and b.propertyvalueid=5
    where coalesce(a.userid, b.userid) is not null
    

    You could also say

    where a.userid is not null or b.userid is not null
    

    ---

    from the fiddle

    select *
    from users 
    where id in (
      select userid from usersPropertyValues where propertyvalueid  = 1
      intersect 
      (
      select userid from usersPropertyValues where propertyvalueid  = 3 
        UNION ALL
      select userid from usersPropertyValues where propertyvalueid  = 4
      )
    );
    

    lets see part before the intersect is

    select *
    from users 
    join userPropertyValues a on a.userid = users.id and a.propertyvalueid=1
    

    part after is

    select *
    from users 
    left join userPropertyValues b on b.userid = users.id and b.propertyvalueid=1
    left join userPropertyValues c on c.userid = users.id and c.propertyvalueid=5
    where coalesce(b.userid, c.userid) is not null
    

    so put them together (both are true for an intersect since it is the same as AND):

    select *
    from users 
    join userPropertyValues a on a.userid = users.id and a.propertyvalueid=1
    left join userPropertyValues b on b.userid = users.id and b.propertyvalueid=1
    left join userPropertyValues c on c.userid = users.id and c.propertyvalueid=5
    where coalesce(b.userid, c.userid) is not null