My task is to flag "boss level" participants in data collected from meetings, based on a given org chart.
In other words, it needs to identify the highest ranked individual(s) within the given meeting, the "boss", within the reporting line (branch or subtree in the org tree).
If participants have no boss involved (within their subtree / reporting structure), they become the boss.
If there is a boss present (within their subtree / reporting structure), they are not the boss.
Meetings can have multiple bosses (when they are not on the same branch of the tree).
So the employee closest to the root is the boss, and all his superiors would not be the boss in the meeting, in case they are present.
Tools: SQL Server (Version 2019 or higher)
Org chart tree: Org Chart
Org chart data:
Name | Superior |
---|---|
Root | NULL |
Billy | Root |
Jacob | Billy |
Susan | Billy |
Tracy | Billy |
Sheri | Billy |
Carlos | Tracy |
Andrew | Tracy |
Hunter | Sheri |
Charly | Root |
Larry | Charly |
Harry | Charly |
Toni | Charly |
Chris | Larry |
Michael | Harry |
Anita | Toni |
Meeting data and expected results (Boss):
Meeting | Name | Meeting Boss (expected results) |
---|---|---|
Meeting 1 | Tracy | Y |
Meeting 1 | Hunter | Y |
Meeting 2 | Billy | Y |
Meeting 2 | Charly | Y |
Meeting 2 | Hunter | N |
Meeting 2 | Larry | N |
Meeting 3 | Billy | Y |
Meeting 3 | Hunter | N |
Meeting 3 | Anita | Y |
Meeting 4 | Anita | Y |
Meeting 4 | Billy | Y |
Meeting 5 | Anita | Y |
Meeting 6 | Billy | Y |
Meeting 6 | Charly | Y |
My guess is that a recursive CTE might be needed here, but wonder if there are simpler methods to get this done. Also interested in the commonly used terminology to describe this problem. Thank you!
Test data:
Select *
From
(
values
('Root', NULL),
('Billy', 'Root'),
('Jacob', 'Billy'),
('Susan', 'Billy'),
('Tracy', 'Billy'),
('Sheri', 'Billy'),
('Carlos', 'Tracy'),
('Andrew', 'Tracy'),
('Hunter', 'Sheri'),
('Charly', 'Root'),
('Larry', 'Charl'),
('Harry', 'Charly'),
('Toni', 'Charly'),
('Chris', 'Larry'),
('Michael', 'Harry'),
('Anita', 'Toni')
) A (Name, Superior)
Select *
From
(
values
('Meeting 1', 'Tracy', 'Y'),
('Meeting 1', 'Hunter', 'Y'),
('Meeting 2', 'Billy', 'Y'),
('Meeting 2', 'Charly', 'Y'),
('Meeting 2', 'Hunter', 'N'),
('Meeting 2', 'Larry', 'N'),
('Meeting 3', 'Billy', 'Y'),
('Meeting 3', 'Hunter', 'N'),
('Meeting 3', 'Anita', 'Y'),
('Meeting 4', 'Anita', 'Y'),
('Meeting 4', 'Billy', 'Y'),
('Meeting 5', 'Anita', 'Y'),
('Meeting 6', 'Billy', 'Y'),
('Meeting 6', 'Charly', 'Y')
) A ([Meeting], [Name], [Meeting Boss (expected results)])
I can think of two approaches, but both will require a recursive CTE to walk the hierarchy.
One approach is to start with each participant and use a recursive CTE to walk the hierarchy in search of a superior who is also attending the same meeting. If found, the starting participant is not a boss. If none are found, that participant is a boss.
WITH AttendeeSuperiors AS (
SELECT M.Meeting, M.Participant, M.Expected, O.Superior
FROM MeetingData M
JOIN OrgChart O
ON O.Employee = M.Participant
UNION ALL
SELECT ATS.Meeting, ATS.Participant, ATS.Expected, O.Superior
FROM AttendeeSuperiors ATS
JOIN OrgChart O
ON O.Employee = ATS.Superior
)
SELECT
ATS.Meeting, ATS.Participant, ATS.Expected,
CASE WHEN MAX(M2.Participant) IS NULL THEN 'Y' ELSE 'N' END AS Boss
FROM AttendeeSuperiors ATS
LEFT JOIN MeetingData M2
ON M2.Meeting = ATS.Meeting
AND M2.Participant = ATS.Superior
GROUP BY ATS.Meeting, ATS.Participant, ATS.Expected
ORDER BY ATS.Meeting, ATS.Participant;
(There may be a slight performance advantage in checking for superior participation within the CTE, so that we can stop the recursion on first match. See the linked demo for a modified version.)
Another approach is to use a recursive CTE to build a complete list of all superiors for each employee. (Ideally, this could be written to a table and indexed for efficient repeated use.) Next, for each meeting participant, we would search that superior list for one that is also an attendee of the same meeting. If found, the starting participant is not a boss. If none are found, that participant is a boss.
WITH EmployeeSuperiors AS (
SELECT O.Employee, O.Superior
FROM OrgChart O
--WHERE O.Employee IN (SELECT DISTINCT Participant FROM MeetingData)
UNION ALL
SELECT S.Employee, O.Superior
FROM EmployeeSuperiors S
JOIN OrgChart O
ON O.Employee = S.Superior
)
SELECT
M.*,
CASE WHEN NOT EXISTS(
SELECT *
FROM EmployeeSuperiors S
JOIN MeetingData M2
ON M2.Meeting = M.Meeting
AND M2.Participant = S.Superior
WHERE S.Employee = M.Participant
) THEN 'Y' ELSE 'N' END AS Boss
FROM MeetingData M
ORDER BY M.Meeting, M.Participant;
If we are calculating EmployeeSuperiors
on-the-fly for each execution (and not persisting that data), we can limit that calculation to just those employees participating in the meetings by uncommenting the WHERE
condition above.
As for which approach is more efficient, I believe the first version may be better for a small number of meetings and participants relative to the total number of employees. The second approach may be better if you will be processing many meetings and have the ability to define, populate, index, and maintain a persistent EmployeeSuperiors
table. In any case, I suggest that you run your own performance tests.
See this db<>fiddle for a demo.