Search code examples
sql-serverhierarchy

Best Way to Store Hierarchical Items of Different Types in SQL Server


The project I am working on has 7 levels to their business hierarchy. None of them are of the same type. Meaning, this is not an organizational chart and all of the items are Employees of some level or other. They are things like Division, Region, Sales VP, Business Unit and such. Yes, some of them are perhaps Employees, but not all of them.

Currently, I have them each in their own table that follow a similar pattern to each other where each child has a foreign key to their parent. So starting with the smallest part of the hierarchy:

BusinessUnit (table)

ID
Name
AreaManagerID

AreaManager (table)

ID
Name
RegionalManagerID

RegionalManager (table)

ID
Name
DivisionID

Division (table)

ID
Name

There are 3 more tables intermixed, but this should show you the rather simple link between each level of the hierarchy. Every child must have a parent. There will not be any AreaManager that has no BusinessUnits.

Reading up a bit on the HierarchyID I am not totally sure it will help me.

I know the above works and it is fine. But I am more wondering if there is a better way and/or faster way when I am tasked with being given a Division and need to find all of the BU's within it. Or even being given a Region and needing to find all of the BU's within it.


Solution

  • If you're looking for "get me descendants", HierarchyID is pretty fast at finding descendants to an arbitrary depth. If you were to do that, I'd put all of the entities into one table with break out tables for the different types. It would look a little something like this:

    CREATE TABLE [dbo].[BusinessEntity] (
        [EntityID] INT IDENTITY NOT NULL PRIMARY KEY,
        [ParentEntityID] INT
            REFERENCES [dbo].[BusinessEntity] ([EntityID]),
        [EntityType] TINYINT NOT NULL,
        [Path] HIERARCHYID
    );
    
    CREATE TABLE [dbo].[BusinessUnit] (
        [ID] INT NOT NULL PRIMARY KEY
            REFERENCES [dbo].[BusinessEntity] ([EntityID]),
        [Name] VARCHAR(255) NOT NULL,
        [AreaManagerID] INT NOT NULL
    );
    
    CREATE TABLE [dbo].[AreaManager] (
        [ID] INT NOT NULL PRIMARY KEY
            REFERENCES [dbo].[BusinessEntity] ([EntityID]),
        [Name] VARCHAR(255) NOT NULL,
        [RegionalManagerID] INT NOT NULL
    );
    
    CREATE TABLE [dbo].[RegionalManager] (
        [ID] INT NOT NULL PRIMARY KEY
            REFERENCES [dbo].[BusinessEntity] ([EntityID]),
        [Name] VARCHAR(255) NOT NULL,
        [DivisionID] INT NOT NULL
    );
    
    CREATE TABLE [dbo].[Division] (
        [ID] INT NOT NULL PRIMARY KEY
            REFERENCES [dbo].[BusinessEntity] ([EntityID]),
        [Name] VARCHAR(255)
    );
    

    When you go to insert into one of your actual tables (e.g. BusinessUnit, RegionalManager, etc), you'd first create a record in BusinessEntity and then use the generated identity value as the identifier for the insert. You'll also need to keep the Path column up to date with respect to its relationship in the hierarchy. That is, let's say that I have the following data in BusinessEntity:

    SET IDENTITY_INSERT [dbo].[BusinessEntity] ON;
    INSERT INTO [dbo].[BusinessEntity]
            ( [EntityID],
              [ParentEntityID] ,
              [EntityType] 
            )
    VALUES  
        (1, NULL, 1),
        (2, 1, 2),
        (3, 1, 2),
        (4, 2, 3),
        (5, 3, 3),
        (6, 4, 4),
        (7, 6, 5);
    

    Then I can use the following CTE to generate the Path values

    WITH cte AS (
        SELECT [be].[EntityID], [be].[ParentEntityID], CAST(CONCAT('/', [be].[EntityID], '/') AS VARCHAR(MAX)) AS [Path] 
        FROM [dbo].[BusinessEntity] AS [be]
        WHERE [be].[ParentEntityID] IS null
    
        UNION ALL
    
        SELECT [child].[EntityID], [child].[ParentEntityID], CAST(CONCAT([parent].[Path], child.[EntityID], '/') AS VARCHAR(MAX))
        FROM [dbo].[BusinessEntity] AS [child]
        JOIN [cte] AS [parent]
            ON [child].[ParentEntityID] = [parent].[EntityID]
    
    )
    UPDATE [be]
    SET [be].[Path] = cte.[Path]
    FROM [dbo].[BusinessEntity] AS be
    JOIN cte
        ON [be].[EntityID] = [cte].[EntityID]
    WHERE [Path] IS NULL;
    

    Of course, keeping them up to date is a lot easier. When you insert a new row, grab the Path from the parent row, tack your ID onto it, and that's your Path. Updating a row's parent is a little trickier, but not terrible. I'll leave it as an exercise for the reader. But as a hint, it involves the GetReparentedValue() method of the HierarchyID data type. Finally, if ever you don't trust the value in Path (as it is a derived value), you can just set whatever values you don't trust to NULL and re-run the above cte update.