Search code examples
sqlsql-servert-sqlsql-server-2016gaps-and-islands

SQL Server (terminal result) hierarchy map


In SQL Server 2016, I have a table with the following chaining structure:

dbo.Item

OriginalItem ItemID
NULL 7
1 2
NULL 1
5 6
3 4
NULL 8
NULL 5
9 11
2 3

EDIT NOTE: Bold numbers were added as a response to @lemon comments below

Importantly, this example is a trivialized version of the real data, and the neatly ascending entries is not something that is present in the actual data, I'm just doing that to simplify the understanding.

I've constructed a query to get what I'm calling the TerminalItemID, which in this example case is ItemID 4, 6, and 7, and populated that into a temporary table @TerminalItems, the resultset of which would look like:

@TerminalItems

TerminalItemID
4
6
7
8
11

What I need, is a final mapping table that would look something like this (using the above example -- note that it also contains for 4, 6, and 7 mapping to themselves, this is needed by the business logic):

@Mapping

ItemID TerminalItemID
1 4
2 4
3 4
4 4
5 6
6 6
7 7
8 8
9 11
11 11

What I need help with is how to build this last @Mapping table. Any assistance in this direction is greatly appreciated!


Solution

  • This is a typical gaps-and-islands problem and can also be carried out without recursion in three steps:

    • assign 1 at the beginning of each partition
    • compute a running sum over your flag value (generated at step 1)
    • extract the max "ItemID" on your partition (generated at step 2)
    WITH cte1 AS (
        SELECT *, CASE WHEN OriginalItem IS NULL THEN 1 ELSE 0 END AS changepartition 
        FROM Item
    ), cte2 AS (
        SELECT *, SUM(changepartition) OVER(ORDER BY ItemID) AS parts
        FROM cte1
    )
    SELECT ItemID, MAX(ItemID) OVER(PARTITION BY parts) AS TerminalItemID 
    FROM cte2
    

    Check the demo here.

    Assumption: Your terminal id items correspond to the "ItemID" value preceding a NULL "OriginalItem" value.


    EDIT: "Fixing orphaned records."

    The query works correctly when records are not orphaned. The only way to deal them, is to get missing records back, so that the query can work correctly on the full data.

    This is carried out by an extra subquery (done at the beginning), that will apply a UNION ALL between:

    • the available records of the original table
    • the missing records
    WITH fix_orphaned_records AS(
        SELECT * FROM Item
    
        UNION ALL
    
        SELECT NULL AS OriginalItem, 
               i1.OriginalItem AS ItemID
        FROM      Item i1
        LEFT JOIN Item i2 ON i1.OriginalItem = i2.ItemID
        WHERE i1.OriginalItem IS NOT NULL AND i2.ItemID IS NULL
    ), cte AS (
     ...
    

    Missing records correspond to "OriginalItem" values that are never found within the "ItemID" field. A self left join will uncover these missing records.

    Check the demo here.