Search code examples
sqlpostgresqlrelational-division

SQL filter an exact set of values


I am trying to find rows that have an exact set of matching values.

For example, if I was looking for ID 0,1 and 2 I would query for the type, size and name date using the values ((7,2),(3,2),(7,1)).

ID     Type    Size   
----------------------
0      7      2       
1      3      2       
2      7      1       
3      7      6       
4      7      2       
5      7      null  

The answer to this query would be as follows... (note ID 4 is also returned as it has the same type/size as ID 0)

ID     Type    Size    
---------------------
0      7      2       
1      3      2       
2      7      1       
4      7      2       

I've tried to do this with the following query SELECT * FROM my_table WHERE (Type,Size) IN ((7,2),(3,2),(7,1))

However this does not work if one of the values does not exist, for example

SELECT * FROM my_table WHERE (type,size) IN ((7,2),(3,2),(7,99)) should return null as (7,99) does not exist

It also breaks if a null value is included

https://www.db-fiddle.com/f/2vxEEQNMhnK9oFGAhrQXip/2


Solution

  • This is a variation on Relational Division With Remainder problem.

    You are trying to get the set of rows from the table (the dividend) that match the input set (the divisor), but you may have duplicates, which complicates matters.

    One solution is to left-join the main table to the data to find, then look for missing rows by checking that COUNT(*) is equal to COUNT(maintable.id), which it would if everything was matched. We need to do this count as a window function, because we want the final result to include all rows, not just a yes/no answer.

    We can use IS NOT DISTINCT FROM to compare nulls correctly.

    WITH ToFind(type, size) AS (
        VALUES
          (7::int,2::int),
          (3,2),
          (7,99)
    ),
    Matches AS (
        SELECT
          mt.*,
          COUNT(*) OVER () - COUNT(mt.id) OVER () AS countMissing
        FROM ToFind tf
        LEFT JOIN my_table mt
          ON (tf.type, tf.size) IS NOT DISTINCT FROM (mt.type, mt.size)
    )
    SELECT
      m.id,
      m.type,
      m.size
    FROM Matches m
    WHERE m.countMissing = 0;
    

    db<>fiddle