Search code examples
sqlsql-serverparent-childsql-server-2017

How to build parent-child hierarchies (starting from root) in SQL Server?


I am trying to get the path from the root node to its children using SQL server.

The source data looks like :

Source Data

The target should look like :

Target Data

As I am going to implement this in an ETL tool exclusively using the ETL transformations, I would like to achieve this output without using CONNECT BY equivalent approach. Below query gets me the result and a few more records :

select case when level02.geography_02 is not NULL
    then '3'
    else case when level01.geography_02 is not null
                then '2'
                else case when root.geography_02 is not null
                    then '1'
                    end 
        end
end as levels,
root.geography_01 as root, root.geography_02 as super_parent,
case when level01.geography_02 is not null
        then level01.geography_02
        else ''
        end as parent,
case when level02.geography_02 is not null
        then level02.geography_02
        else ''
        end as child
from geo_table root
left join geo_table level01
on root.geography_02 = level01.geography_01
left join geo_table level02
on level01.geography_02 = level02.geography_01

Can you please on how to get the desired output?


Solution

  • I think you just need some filtering. That said, the rest of your query can also be simplified a bit -- particularly by using COALESCE():

    select (case when level02.geoghraphy_02 is not NULL then '3'
                 when level01.geoghraphy_02 is not null then '2'
                 when root.geoghraphy_02 is not null then '1'
            end) as levels,
           root.geoghraphy_01 as root,
           root.geoghraphy_02 as super_parent,
           coalesce(level01.geography_02, '') as parent,
           coalesce(level02.geography_02, '') as child
    from geo_table root left join
         geo_table level01
         on root.geography_02 = level01.geography_01 left join
         geo_table level02
         on level01.geography_02 = level02.geography_01
    where not exists (select 1
                      from geo_table gt
                      where gt.geography_02 = root.geography_01
                     );
    

    Basically, you just need to limit the "root" to actual root records. You've actually handled the trickier part of the logic (in my opinion).