Search code examples
sqlsubset

SQL - Find subsets that meet combination of conditions


This example is a bit silly but it's basically what I'm dealing with... Let's say I've got 2 tables, T_PARENT and T_CHILDREN, where a parent can have 0..N children. A simplified example of the tables structure and data is as following:

  • T_PARENT
IDPARENT NAME
1 Carlos
  • T_CHILDREN
IDPARENT NAME AGE HEIGHT
1 Juan 9 120
1 Juan 9 110
1 Pablo 9 130
1 Pablo 9 120
1 Pablo 7 110
1 Diego 9 110
1 Diego 9 100

I need to find in the most efficient way which parents have children that meet the following conditions (this is an example of possible combinations):

  1. At least 1 child where name='Pablo' (doesn't matter age or height).
  2. At least 1 child where name='Juan' and age=9 (doesn't matter height).
  3. At least 1 child where name='Diego' and age=9 and height=110
  4. At least 1 child where name='Diego' and age=9 and height=120

In this example with data, parent with IDPARENT=1 shouldn't be retrieved, because he doesn't have a child named Diego of age 9 and height 120.

NOTE: T_CHILDREN has an important restriction: values of NAME-AGE-HEIGHT columns must be unique- So for example, in this world as a parent, you can have N children named Pablo of age 9, but they cannot have same height.

So far, I can only think of accomplishing this doing 3 subqueries in worst case scenario:

SELECT p.IDPARENT
FROM T_PARENT p
WHERE 1 <= (
    SELECT count(*)
    FROM T_CHILDREN c
    where c.IDPARENT = p.IDPARENT
    and c.NAME = 'Pablo'
)
AND 1 <= (
    SELECT count(*)
    FROM T_CHILDREN c
    WHERE c.IDPARENT = p.IDPARENT
    AND c.NAME = 'Juan' AND c.AGE = 9
)
AND 2 <= (
    SELECT count(*)
    FROM T_CHILDREN c
    WHERE c.IDPARENT = p.IDPARENT
    AND (c.NAME = 'Diego' AND c.AGE = 9 AND c.HEIGHT = 110)
    OR (c.NAME = 'Diego' AND c.AGE = 9 AND c.HEIGHT = 120)
);

Solution

  • One option is to use conditional aggregation as the following:

    SELECT T.IDPARENT, T.NAME
    FROM T_PARENT T JOIN T_CHILDREN D
    ON T.IDPARENT = D.IDPARENT
    GROUP BY T.IDPARENT, T.NAME
    HAVING
    COUNT(CASE WHEN D.NAME='Pablo' THEN 1 END) > 0 AND
    COUNT(CASE WHEN D.NAME='Juan' AND D.AGE=9 THEN 1 END) > 0 AND
    COUNT(CASE WHEN D.NAME='Diego' AND D.AGE=9 AND D.HEIGHT=110 THEN 1 END) > 0 AND
    COUNT(CASE WHEN D.NAME='Diego' AND D.AGE=9 AND D.HEIGHT=120 THEN 1 END) > 0
    ORDER BY T.IDPARENT, T.NAME
    

    See a demo.

    Another option is to use EXISTS operator as the following:

    SELECT T.IDPARENT, T.NAME
    FROM T_PARENT T 
    WHERE EXISTS(SELECT 1 FROM T_CHILDREN D WHERE T.IDPARENT = D.IDPARENT AND D.NAME='Pablo')
    AND EXISTS(SELECT 1 FROM T_CHILDREN D WHERE T.IDPARENT = D.IDPARENT AND  D.NAME='Juan' AND D.AGE=9)
    AND EXISTS(SELECT 1 FROM T_CHILDREN D WHERE T.IDPARENT = D.IDPARENT AND D.NAME='Diego' AND D.AGE=9 AND D.HEIGHT=110)
    AND EXISTS(SELECT 1 FROM T_CHILDREN D WHERE T.IDPARENT = D.IDPARENT AND D.NAME='Diego' AND D.AGE=9 AND D.HEIGHT=120)
    

    See a demo.