Search code examples
sqlpostgresqlgreenplum

Identify the missing combinations by comparing two tables


I have 3 columns in my main table.

  1. Score ( 0-99,100-110)
  2. rate (5-9 ,10-15)
  3. location (A, B)

And I have all the combinations of those (2 * 2 * 2 = 8 combinations)

Below are the combinations in my main table

score     rate   location
----------------------------
 0-99      5-9     A
100-110    5-9     A
 0-99     10-15    A
100-110   10-15    A
0-99       5-9     B
100-110    5-9     B
0-99      10-15    B
100-110   10-15    B

I have another table with the actual data. I want to find out all the missing combinations in the actual table. How to find those missing combinations and append to the actual table with value as '0' in the column?

Actual data

score     rate   location  value 
---------------------------------
 0-99     10-15    A         3
100-110   10-15    A         6
0-99      10-15    B         1

Expected output

 score     rate   location  value 
------------------------------------
 0-99        5-9     A           0   
 0-99        10-15   A           3
100-110     10-15    A           6
100-110      5-9     B           0
0-99        10-15    B           1
100-110     5-9      A           0
100-110    10-15     B           0
0-99       10-15     B           0

Solution

  • From your actual data and expect result score,rate,location columns value seem to fixed, so you can use UNION ALL to let score,rate,location be tables.

    CROSS JOIN to generate a Cartesian product for score,rate,location Union tables, make the full table.

    Then do OUTER JOIN

    create table t(
      score varchar(50),
      rate varchar(50),
      location  varchar(50),
      value  int
    );
    
    
    insert into t values ('0-99','10-15','A',3);   
    insert into t values ('100-110','10-15','A',6);
    insert into t values ('0-99','10-15','B',1);
    

    Query 1:

    SELECT  
      s.score,
      r.rate,
      l.location,
      coalesce(t1.value,0)
    FROM 
    (
      SELECT '0-99' score
      UNION ALL
      SELECT '100-110'
    ) s
    CROSS JOIN
    (
      SELECT '10-15' rate
      UNION ALL
      SELECT '5-9'
    ) r
    CROSS JOIN
    (
      SELECT 'A' as "location"
      UNION ALL
      SELECT 'B'
    ) l
    LEFT JOIN t t1 on s.score = t1.score and t1.rate = r.rate and t1.location = l.location
    ORDER BY  l.location  
    

    Results:

    |   score |  rate | location | coalesce |
    |---------|-------|----------|----------|
    |    0-99 | 10-15 |        A |        3 |
    |    0-99 |   5-9 |        A |        0 |
    | 100-110 | 10-15 |        A |        6 |
    | 100-110 |   5-9 |        A |        0 |
    | 100-110 |   5-9 |        B |        0 |
    |    0-99 | 10-15 |        B |        1 |
    | 100-110 | 10-15 |        B |        0 |
    |    0-99 |   5-9 |        B |        0 |