Search code examples
sqlhierarchical-query

Identifying Keys for new hierarchies


Assume my source looks like

X1  H_ID    l1_no   l2_no   l3_no
20  1       2       12      42
20  1       6       16      42
20  1       1       19      41
20  1       5       15      41

And my lookup table looks like

X1  H_ID    l1_no   l1_key  l2_no   l2_key  l3_no   l3_key
20  1       4       104         14      114         44      144
20  1       3       103         13      113         43      143
20  1       1       101         11      111         41      141
20  1       2       102         12      112         42      142

In this case, My record is considered as an Update only if I have a complete match and the complete keys will be retained. If it doesnt match for any of the column, then it is considered as an Insert. The tricky part is even for inserts, every hierarchy levels must be checked and the keys must be retained for matching bits.

For example Assuming I have 1->11->14 already and I verify 1->11->15 then this must be an insert record but the keys for 1 and 11 must be retained and for 15 it must be generated

Also If I get a 1->12->14 then in this case, the record will be considered as an insert and the key must be retained only for 1 and new keys must be generated for 12 and 14

One possible solution I can think of is doing Left Join for each levels by treating the data set as a 2D matrix. But its not a great way to do it.

WITH lookup ( x1, h_id,  l1_no, l1_key, l2_no, l2_key,l3_no,l3_key ) AS
         (SELECT 20, 1, 1, 101, 11, 111, 41, 141 FROM sys.dual UNION ALL
          SELECT 20, 1, 2, 102, 12, 112, 42, 142 FROM sys.dual UNION ALL
          SELECT 20, 1, 3, 103, 13, 113, 43, 143 FROM sys.dual UNION ALL
          SELECT 20, 1, 4, 104, 14, 114, 44, 144 FROM sys.dual),
               dwh_keys AS 
         (SELECT x1, h_id, MAX(l1_key) mk1,  MAX(l2_key) mk2,  MAX(l3_key) mk3 
          FROM  lookup GROUP BY 1,2) ,
     src ( x1, h_id, l1_no,  l2_no, l3_no ) AS
         (SELECT 20, 1, 1,  19, 41  FROM sys.dual UNION ALL
          SELECT 20, 1, 2,  12, 42  FROM sys.dual UNION ALL
          SELECT 20, 1, 5,  15, 41  FROM sys.dual UNION ALL
          SELECT 20, 1, 6,  16, 42  FROM sys.dual),
     stg1 AS
         (SELECT      b.*, 
                      CASE WHEN a.x1 IS NOT NULL THEN 'U' ELSE 'I' END AS delta_type
          FROM        src b LEFT OUTER JOIN lookup a ON a.x1 = b.x1 AND a.h_id = b.h_id 
          AND a.l1_no = b.l1_no AND a.l2_no = b.l2_no AND a.l3_no = b.l3_no)
SELECT      a.*, 
         CASE WHEN delta_type = 'I' AND b.l1_no IS NULL  THEN mk1 + ROW_NUMBER ( ) OVER (ORDER BY delta_type) ELSE b.l1_key END AS new1_key,
         CASE WHEN delta_type = 'I' AND c.l2_no IS NULL  THEN mk2 + ROW_NUMBER ( ) OVER (ORDER BY delta_type) ELSE c.l2_key END AS new1_key,
         CASE WHEN delta_type = 'I' AND d.l3_no IS NULL  THEN mk3 + ROW_NUMBER ( ) OVER (ORDER BY delta_type) ELSE d.l3_key END AS new1_key,
         delta_type
FROM            stg1 a 
LEFT OUTER JOIN lookup b ON a.x1 = b.x1 AND a.h_id = b.h_id AND a.l1_no = b.l1_no 
LEFT OUTER JOIN lookup c ON a.x1 = c.x1 AND a.h_id = c.h_id AND a.l1_no = c.l1_no AND a.l2_no = c.l2_no         
LEFT OUTER JOIN lookup d ON a.x1 = d.x1 AND a.h_id = d.h_id AND a.l1_no = d.l1_no AND a.l2_no = d.l2_no AND a.l3_no = d.l3_no 
LEFT OUTER JOIN dwh_keys xx ON a.x1 = xx.x1 AND a.h_id = xx.h_id;

Output I get (Also what I want)

X1  H_ID    L1_NO   L2_NO   L3_NO   DELTA_TYPE  NEW1_KEY    NEW1_KEY    NEW1_KEY    DELTA_TYPE
20  1   6   16  42  I   105 115 145 I
20  1   5   15  41  I   106 116 146 I
20  1   1   19  41  I   101 117 147 I
20  1   2   12  42  U   102 112 142 U

My database WX2 doesnt support recursive CTE and all I have is plain ANSI SQL syntax except recursive CTE to play with.


Solution

  • Finally found a solution, though a modified version of my initial suggestion

    WITH lookup ( x1, h_id, l1_no, l1_key, l2_no, l2_key, l3_no, l3_key ) AS
             (SELECT 20, 1, 1, 101, 11, 111, 41, 141 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 1, 101, 12, 112, 42, 142 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 3, 102, 13, 113, 43, 143 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 4, 103, 14, 114, 44, 144 FROM sys.DUAL),
         dwh_keys AS
             (SELECT      x1, h_id, MAX ( l1_key ) mk1, MAX ( l2_key ) mk2, MAX ( l3_key ) mk3
              FROM        lookup
              GROUP BY    1, 2),
         src ( x1, h_id, l1_no, l2_no, l3_no ) AS
             (SELECT 20, 1, 1, 19, 41 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 1, 12, 42 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 5, 15, 41 FROM sys.DUAL UNION ALL
              SELECT 20, 1, 6, 16, 42 FROM sys.DUAL),
         stg1 AS
             (SELECT    DISTINCT b.*, CASE WHEN a.x1 IS NOT NULL THEN 'U' ELSE 'I' END AS delta_type
              FROM      src b
                        LEFT OUTER JOIN
                        lookup a
                            ON     a.x1 = b.x1
                               AND a.h_id = b.h_id
                               AND a.l1_no = b.l1_no
                               AND a.l2_no = b.l2_no
                               AND a.l3_no = b.l3_no),
         stg2 AS
             (SELECT    a.*, CASE WHEN delta_type = 'I' AND b.l1_no IS NULL THEN 0 ELSE 1 END AS flag1, b.l1_key AS k1,
                        CASE WHEN delta_type = 'I' AND c.l2_no IS NULL THEN 0 ELSE 1 END AS flag2, c.l2_key AS k2,
                        CASE WHEN delta_type = 'I' AND d.l3_no IS NULL THEN 0 ELSE 1 END AS flag3, d.l3_key AS k3
              FROM      stg1 a
                        LEFT OUTER JOIN (SELECT DISTINCT x1, h_id, l1_no, l1_key FROM lookup) b
                            ON a.x1 = b.x1 AND a.h_id = b.h_id AND a.l1_no = b.l1_no
                        LEFT OUTER JOIN (SELECT DISTINCT x1, h_id, l1_no, l1_key, l2_no, l2_key FROM lookup) c
                            ON a.x1 = c.x1 AND a.h_id = c.h_id AND a.l1_no = c.l1_no AND a.l2_no = c.l2_no
                        LEFT OUTER JOIN
                        (SELECT DISTINCT x1, h_id, l1_no, l1_key, l2_no, l2_key, l3_no, l3_key FROM lookup) d
                            ON     a.x1 = d.x1
                               AND a.h_id = d.h_id
                               AND a.l1_no = d.l1_no
                               AND a.l2_no = d.l2_no
                               AND a.l3_no = d.l3_no),
         keys AS
             (SELECT    a.*,
                        CASE WHEN flag1 = 0 THEN mk1 + DENSE_RANK ( ) OVER (ORDER BY a.delta_type, l1_no) ELSE a.k1 END AS new_key1,
                        CASE WHEN flag2 = 0 THEN mk2 + DENSE_RANK ( ) OVER (ORDER BY a.delta_type, l1_no, l2_no) ELSE a.k2 END AS new_key2,
                        CASE WHEN flag3 = 0 THEN mk3 + DENSE_RANK ( ) OVER (ORDER BY a.delta_type, l1_no, l2_no, l3_no) ELSE a.k3 END AS new_key3
              FROM      stg2 a LEFT OUTER JOIN dwh_keys b ON a.x1 = b.x1 AND a.h_id = b.h_id)
    SELECT      x1, h_id, l1_no, l2_no, l3_no, delta_type, new_key1, new_key2, new_key3
    FROM        keys
    ORDER BY    1, 2, 3, 4, 5;