Search code examples
sql-servert-sqlhierarchical-data

Flag all bosses in meeting based on org tree


My task is to flag "boss level" participants in data collected from meetings, based on a given org chart.

In other words, it needs to identify the highest ranked individual(s) within the given meeting, the "boss", within the reporting line (branch or subtree in the org tree).

If participants have no boss involved (within their subtree / reporting structure), they become the boss.

If there is a boss present (within their subtree / reporting structure), they are not the boss.

Meetings can have multiple bosses (when they are not on the same branch of the tree).

So the employee closest to the root is the boss, and all his superiors would not be the boss in the meeting, in case they are present.

Tools: SQL Server (Version 2019 or higher)

Org chart tree: Org Chart

Org chart data:

Name Superior
Root NULL
Billy Root
Jacob Billy
Susan Billy
Tracy Billy
Sheri Billy
Carlos Tracy
Andrew Tracy
Hunter Sheri
Charly Root
Larry Charly
Harry Charly
Toni Charly
Chris Larry
Michael Harry
Anita Toni

Meeting data and expected results (Boss):

Meeting Name Meeting Boss (expected results)
Meeting 1 Tracy Y
Meeting 1 Hunter Y
Meeting 2 Billy Y
Meeting 2 Charly Y
Meeting 2 Hunter N
Meeting 2 Larry N
Meeting 3 Billy Y
Meeting 3 Hunter N
Meeting 3 Anita Y
Meeting 4 Anita Y
Meeting 4 Billy Y
Meeting 5 Anita Y
Meeting 6 Billy Y
Meeting 6 Charly Y

My guess is that a recursive CTE might be needed here, but wonder if there are simpler methods to get this done. Also interested in the commonly used terminology to describe this problem. Thank you!


Test data:

Select *
From
(
    values
        ('Root', NULL),
        ('Billy', 'Root'),
        ('Jacob', 'Billy'),
        ('Susan', 'Billy'),
        ('Tracy', 'Billy'),
        ('Sheri', 'Billy'),
        ('Carlos', 'Tracy'),
        ('Andrew', 'Tracy'),
        ('Hunter', 'Sheri'),
        ('Charly', 'Root'),
        ('Larry', 'Charl'),
        ('Harry', 'Charly'),
        ('Toni', 'Charly'),
        ('Chris', 'Larry'),
        ('Michael', 'Harry'),
        ('Anita', 'Toni')
) A (Name, Superior)


Select *
From
(
    values
        ('Meeting 1', 'Tracy', 'Y'),
        ('Meeting 1', 'Hunter', 'Y'),
        ('Meeting 2', 'Billy', 'Y'),
        ('Meeting 2', 'Charly', 'Y'),
        ('Meeting 2', 'Hunter', 'N'),
        ('Meeting 2', 'Larry', 'N'),
        ('Meeting 3', 'Billy', 'Y'),
        ('Meeting 3', 'Hunter', 'N'),
        ('Meeting 3', 'Anita', 'Y'),
        ('Meeting 4', 'Anita', 'Y'),
        ('Meeting 4', 'Billy', 'Y'),
        ('Meeting 5', 'Anita', 'Y'),
        ('Meeting 6', 'Billy', 'Y'),
        ('Meeting 6', 'Charly', 'Y')
) A ([Meeting], [Name], [Meeting Boss  (expected results)])

Solution

  • I can think of two approaches, but both will require a recursive CTE to walk the hierarchy.

    One approach is to start with each participant and use a recursive CTE to walk the hierarchy in search of a superior who is also attending the same meeting. If found, the starting participant is not a boss. If none are found, that participant is a boss.

    WITH AttendeeSuperiors AS (
        SELECT M.Meeting, M.Participant, M.Expected, O.Superior
        FROM MeetingData M
        JOIN OrgChart O
            ON O.Employee = M.Participant
    
        UNION ALL
    
        SELECT ATS.Meeting, ATS.Participant, ATS.Expected, O.Superior
        FROM AttendeeSuperiors ATS
        JOIN OrgChart O
            ON O.Employee = ATS.Superior
    )
    SELECT
        ATS.Meeting, ATS.Participant, ATS.Expected,
        CASE WHEN MAX(M2.Participant) IS NULL THEN 'Y' ELSE 'N' END AS Boss
    FROM AttendeeSuperiors ATS
    LEFT JOIN MeetingData M2
        ON M2.Meeting = ATS.Meeting
        AND M2.Participant = ATS.Superior
    GROUP BY ATS.Meeting, ATS.Participant, ATS.Expected
    ORDER BY ATS.Meeting, ATS.Participant;
    

    (There may be a slight performance advantage in checking for superior participation within the CTE, so that we can stop the recursion on first match. See the linked demo for a modified version.)

    Another approach is to use a recursive CTE to build a complete list of all superiors for each employee. (Ideally, this could be written to a table and indexed for efficient repeated use.) Next, for each meeting participant, we would search that superior list for one that is also an attendee of the same meeting. If found, the starting participant is not a boss. If none are found, that participant is a boss.

    WITH EmployeeSuperiors AS (
        SELECT O.Employee, O.Superior
        FROM OrgChart O
        --WHERE O.Employee IN (SELECT DISTINCT Participant FROM MeetingData) 
    
        UNION ALL
    
        SELECT S.Employee, O.Superior
        FROM EmployeeSuperiors S
        JOIN OrgChart O
            ON O.Employee = S.Superior
    )
    SELECT
        M.*,
        CASE WHEN NOT EXISTS(
                SELECT *
                FROM EmployeeSuperiors S
                JOIN MeetingData M2
                    ON M2.Meeting = M.Meeting
                    AND M2.Participant = S.Superior
                WHERE S.Employee = M.Participant
            ) THEN 'Y' ELSE 'N' END AS Boss
    FROM MeetingData M
    ORDER BY M.Meeting, M.Participant;
    

    If we are calculating EmployeeSuperiors on-the-fly for each execution (and not persisting that data), we can limit that calculation to just those employees participating in the meetings by uncommenting the WHERE condition above.

    As for which approach is more efficient, I believe the first version may be better for a small number of meetings and participants relative to the total number of employees. The second approach may be better if you will be processing many meetings and have the ability to define, populate, index, and maintain a persistent EmployeeSuperiors table. In any case, I suggest that you run your own performance tests.

    See this db<>fiddle for a demo.