Search code examples
sqlsnowflake-cloud-data-platformcommon-table-expressionrecursive-query

CTE query in snowflake to traverse the hierarchy


I have an emp table that has data like below in Snowflake. I need to add a column with the col name ED_ID will will point to the immediate ED_id in the hierarchy. If there is no ED in the employee's hierarchy then ED_ID will be null.

+-----------+---------+-----------+------------+-------+--------------+
| loginName | loginID | managerid | department | title | managertitle |
+-----------+---------+-----------+------------+-------+--------------+
| asr       |      26 |        56 | is         | as    | vp           |
| meh       |      56 |        34 | is         | vp    | ed           |
| pra       |      34 |       345 | is         | ed    | md           |
| dav       |     345 |        65 | is         | md    | md           |
| erc       |      65 |       908 | is         | md    | md           |
| mev       |     908 |         1 | is         | md    | ce           |
| rup       |      32 |        77 | ec         | vp    | vp           |
| pra       |      77 |        67 | ec         | vp    | vp           |
| ail       |      67 |       101 | ec         | vp    | md           |
| jen       |     101 |        97 | ec         | md    | md           |
| kim       |      97 |         1 | ec         | md    | ce           |
| shr       |      17 |        90 | td         | as    | vp           |
| dan       |      90 |       333 | td         | vp    | vp           |
| ruh       |     333 |       123 | td         | vp    | ed           |
| fin       |     123 |        39 | td         | ed    | ed           |
| sam       |      39 |         9 | td         | ed    | md           |
| aug       |       9 |       111 | td         | md    | md           |
| jun       |     111 |         1 | td         | md    | ce           |
+-----------+---------+-----------+------------+-------+--------------+

So recursive query should transform above table like below

+-----------+---------+-----------+------------+-------+--------------+------+
| loginName | loginID | managerid | department | title | managertitle | Edid |
+-----------+---------+-----------+------------+-------+--------------+------+
| asr       |      26 |        56 | is         | as    | vp           |   34 |
| meh       |      56 |        34 | is         | vp    | ed           |   34 |
| pra       |      34 |       345 | is         | ed    | md           |      |
| dav       |     345 |        65 | is         | md    | md           |      |
| erc       |      65 |       908 | is         | md    | md           |      |
| mev       |     908 |         1 | is         | md    | ce           |      |
| rup       |      32 |        77 | ec         | vp    | vp           |      |
| pra       |      77 |        67 | ec         | vp    | vp           |      |
| ail       |      67 |       101 | ec         | vp    | md           |      |
| jen       |     101 |        97 | ec         | md    | md           |      |
| kim       |      97 |         1 | ec         | md    | ce           |      |
| shr       |      17 |        90 | td         | as    | vp           |  123 |
| dan       |      90 |       333 | td         | vp    | vp           |  123 |
| ruh       |     333 |       123 | td         | vp    | ed           |  123 |
| fin       |     123 |        39 | td         | ed    | ed           |   39 |
| sam       |      39 |         9 | td         | ed    | md           |      |
| aug       |       9 |       111 | td         | md    | md           |      |
| jun       |     111 |         1 | td         | md    | ce           |      |
+-----------+---------+-----------+------------+-------+--------------+------+  

I tried to write the below query it gives me the immediate ed if exists in the given hierarchy however I want the additional column to be filled based on the logic explained.

WITH RECURSIVE ManagerHierarchy AS (
SELECT Employee_Name, Employee_Manager_ID, Employee_Designation
FROM your_table_name
WHERE Employee_Name = 'asr'
UNION ALL
SELECT t.Employee_Name, t.Employee_Manager_ID, t.Employee_Designation
FROM your_table_name t
JOIN ManagerHierarchy m ON t.Employee_ID = m.Employee_Manager_ID  ) SELECT Employee_Name FROM ManagerHierarchy WHERE Employee_Designation = 'ED' LIMIT 1;

Could you please help me with the snowflake SQL query which will satisfy my requirement?


Solution

  • To do this, can you create the new edid column in your recursive cte and conditionally fill it based on whether the managertitle is ed. In the recursive portion of the cte, if that check fails, then you just use the previous iterations edid.

    This will look something like:

    CREATE TABLE TableName (loginName varchar(10), loginID int, managerid int, department varchar(10), title varchar(10), managertitle varchar(10));
    INSERT INTO TableName(    loginName, loginID, managerid, department, title, managertitle) VALUES('    asr', 26, 56, 'is', 'as', 'vp')
    ,('    meh', 56, 34, 'is', 'vp', 'ed')
    ,('    pra', 34, 345, 'is', 'ed', 'md')
    ,('    dav', 345, 65, 'is', 'md', 'md')
    ,('    erc', 65, 908, 'is', 'md', 'md')
    ,('    mev', 908, 1, 'is', 'md', 'ce')
    ,('    rup', 32, 77, 'ec', 'vp', 'vp')
    ,('    pra', 77, 67, 'ec', 'vp', 'vp')
    ,('    ail', 67, 101, 'ec', 'vp', 'md')
    ,('    jen', 101, 97, 'ec', 'md', 'md')
    ,('    kim', 97, 1, 'ec', 'md', 'ce')
    ,('    shr', 17, 90, 'td', 'as', 'vp')
    ,('    dan', 90, 333, 'td', 'vp', 'vp')
    ,('    ruh', 333, 123, 'td', 'vp', 'ed')
    ,('    fin', 123, 39, 'td', 'ed', 'ed')
    ,('    sam', 39, 9, 'td', 'ed', 'md')
    ,('    aug', 9, 111, 'td', 'md', 'md')
    ,('    jun', 111, 1, 'td', 'md', 'ce');
    

    WITH RECURSIVE reccte AS
    (
      SELECT loginname, loginid, managerid, 
         department, title, managertitle, 
         CASE WHEN managertitle = 'ed' THEN managerid ELSE NULL END AS EDID, 
         managerid as parentid, 0 as recdepth
      FROM TableName --WHERE loginid = 26
      UNION ALL
      SELECT reccte.loginname, reccte.loginid, reccte.managerid, 
          reccte.department, reccte.title, reccte.managertitle, 
          CASE WHEN TableName.managertitle = 'ed' and reccte.edid IS NULL
             THEN TableName.managerid 
             ELSE reccte.edid END,
          TableName.managerid, reccte.recdepth + 1
      FROM reccte
        INNER JOIN TableName
           ON reccte.parentid = TableName.loginid
    )
    SELECT loginname, loginid, managerid, department, title, managertitle, edid
    FROM (SELECT *, row_number() OVER (PARTITION BY loginname ORDER BY recdepth desc) as rn FROM reccte) dt
    WHERE rn = 1
    

    +-----------+---------+-----------+------------+-------+--------------+------+
    | loginname | loginid | managerid | department | title | managertitle | edid |
    +-----------+---------+-----------+------------+-------+--------------+------+
    | ail       |      67 |       101 | ec         | vp    | md           | null |
    | asr       |      26 |        56 | is         | as    | vp           | 34   |
    | aug       |       9 |       111 | td         | md    | md           | null |
    | dan       |      90 |       333 | td         | vp    | vp           | 123  |
    | dav       |     345 |        65 | is         | md    | md           | null |
    | erc       |      65 |       908 | is         | md    | md           | null |
    | fin       |     123 |        39 | td         | ed    | ed           | 39   |
    | jen       |     101 |        97 | ec         | md    | md           | null |
    | jun       |     111 |         1 | td         | md    | ce           | null |
    | kim       |      97 |         1 | ec         | md    | ce           | null |
    | meh       |      56 |        34 | is         | vp    | ed           | 34   |
    | mev       |     908 |         1 | is         | md    | ce           | null |
    | pra       |      34 |       345 | is         | ed    | md           | null |
    | ruh       |     333 |       123 | td         | vp    | ed           | 123  |
    | rup       |      32 |        77 | ec         | vp    | vp           | null |
    | sam       |      39 |         9 | td         | ed    | md           | null |
    | shr       |      17 |        90 | td         | as    | vp           | 123  |
    +-----------+---------+-----------+------------+-------+--------------+------+
    

    Besides the case expression to populate edid as the recursive cte iterates up the hierarchy, there is also the bit in the actual SELECT at the bottom to grab only the last record that was iterated upon from the recursive portion of the sql. We do this by tracking the depth of the iteration in the recdepth column, and then only allowing the highest recdepth through to the final result set.