Search code examples
sql-servert-sqlcommon-table-expression

Improving Performance of Recursive CTE for calculating totals


I have a SQL Server database with an Employee table containing the following structure:

CREATE TABLE Employee 
(
    EmployeeID INT PRIMARY KEY,
    ManagerID INT,
    EmployeeName NVARCHAR(50),
    Salary DECIMAL(18, 2)
);

I'm using a recursive Common Table Expression (CTE) to calculate the total salaries of each employee and their subordinates. However, with a large number of employees, the query performance becomes sluggish.

WITH RecursiveSalaryCTE AS 
(
    SELECT 
        EmployeeID, ManagerID, EmployeeName, Salary
    FROM 
        Employee
    WHERE 
        ManagerID IS NULL

    UNION ALL

    SELECT 
        e.EmployeeID, e.ManagerID, e.EmployeeName, e.Salary
    FROM 
        Employee e
    JOIN 
        RecursiveSalaryCTE r ON e.ManagerID = r.EmployeeID
)
SELECT 
    EmployeeID, EmployeeName, Salary,
    (SELECT SUM(Salary) FROM RecursiveSalaryCTE 
     WHERE ManagerID = e.EmployeeID) AS TotalSubordinateSalaries
FROM 
    RecursiveSalaryCTE e;

Is there a better way to optimize this query? Any alternative approaches or modifications to the query that could provide faster results would greatly appreciated.

Example data for the Employee table:

INSERT INTO Employee (EmployeeID, ManagerID, EmployeeName, Salary)
VALUES (1, NULL, 'John', 10000.00),
       (2, 1, 'Alice', 7500.00),
       (3, 1, 'Bob', 8000.00),
       -- ... more data ...
       (1000, 999, 'Eve', 6000.00);

Solution

  • Avoid using a sub-query as part of the final SELECT should improve performance and if you actually want the total of all people depending of a manager at any depth level you have to group by common path expression (thus you have to collect the path in the recursive CTE) ..., grouping by managerID gives you the total of direct subordinates only.

    with RecursiveSalaryCTE(EmployeeID, managerid, salary, path) as (
        SELECT EmployeeID, managerid, Salary, cast(concat('/', EmployeeID) as varchar(1000))
        FROM employee
        WHERE managerid IS NULL
        
        UNION ALL
        
        SELECT e.EmployeeID, e.managerid, e.salary, cast(concat(r.path, '/', e.EmployeeID) as varchar(1000))
        FROM employee e
        JOIN RecursiveSalaryCTE r ON e.managerid = r.EmployeeID
    
    )
    select EmployeeID, managerid, salary, Total_all_sub, 
        sum(salary) over(partition by managerid) as total_same_manager
    from (
        select r.EmployeeID, r.managerid, r.salary,
            SUM(r1.Salary) Total_all_sub
        from RecursiveSalaryCTE r
        join RecursiveSalaryCTE r1 on substring(r1.path,1,len(r.path)) = r.path 
        group by r.EmployeeID, r.managerid, r.salary
    ) d
    order by employeeid
    ;
    

    https://dbfiddle.uk/O1pMAy3p