Search code examples
sql-serverazure-sql-database

How to return cumulative totals per year including for years where there is no data


I have a database in SQL Azure with this structure:

CREATE TABLE Years
(
    YearID INT IDENTITY PRIMARY KEY,
    YearName VARCHAR(8)
)

INSERT INTO Years (YearName)
VALUES ('2017-18'),('2018-19'),('2019-20'),('2020-21'),('2021-22'),('2022-23')

CREATE TABLE People
(
    PersonID INT PRIMARY KEY IDENTITY,
    FirstName VARCHAR(100)
)

INSERT INTO People (FirstName)
VALUES ('Fred'), ('Bob'), ('Samit'), ('Kate')

CREATE TABLE WorkCategories 
(
    CatID INT IDENTITY PRIMARY KEY,
    CatName VARCHAR(100)
)

INSERT INTO WorkCategories (CatName) 
VALUES ('Testing'), ('Examining'), ('Teaching'), ('Marking')

CREATE TABLE WorkDone
(
    WorkID INT PRIMARY KEY IDENTITY,
    PersonID INT FOREIGN KEY REFERENCES People(PersonID),
    WorkCategory INT FOREIGN KEY REFERENCES WorkCategories(CatID),
    YearID INT FOREIGN KEY REFERENCES Years(YearID),
    HoursWorked INT
)

INSERT INTO WorkDone (PersonID, WorkCategory, YearID, HoursWorked)
VALUES (1, 1, 1, 10), (1, 1, 2, 30), (1, 1, 3, 15),
       (1, 1, 6, 25), (1, 2, 1, 24), (1, 2, 4, 28), 
       (1, 2, 4, 28), (1, 2, 5, 20), (1, 2, 6, 10)

I want to be able to get a cumulative total of the work done, per category for each person for any given year.

So far, I have created this view:

CREATE VIEW vw_TotalWorkDone
AS
    SELECT
        y.YearName,
        WorkInfo.PersonID, WorkInfo.CatName, 
        WorkInfo.YearID, WorkInfo.TotalHours,
        SUM(WorkInfo.TotalHours) OVER 
(PARTITION BY WorkInfo.PersonID, WorkInfo.CatName ORDER BY y.YearName DESC) AS TotalSinceDate
    FROM 
        Years y
    LEFT JOIN 
        (SELECT
             wc.CatName, wd.PersonID, wd.YearID,
             SUM(wd.HoursWorked) AS TotalHours
         FROM 
             WorkCategories wc
         LEFT JOIN 
             WorkDone wd ON wc.CatID = wd.WorkCategory
         GROUP BY 
             wc.CatName, wd.PersonID, wd.YearID) WorkInfo ON y.YearID = WorkInfo.YearID

The results returned by my view are accurate, looking like this:

YearName PersonID CatName YearID TotalHours TotalSinceDate
2022-23 1 Examining 6 10 10
2021-22 1 Examining 5 20 30
2020-21 1 Examining 4 56 86
2017-18 1 Examining 1 24 110
2022-23 1 Testing 6 25 25
2019-20 1 Testing 3 15 40
2018-19 1 Testing 2 30 70
2017-18 1 Testing 1 10 80

However what I want is to be able to do is use the view to allow me to see how much work the selected person has done of each type since the specified year, like this:

SELECT *
FROM vw_TotalWorkDone 
WHERE PersonID=1 
AND  YearName='2018-19'

to get my answer.

The problem is that although I'll see the total testing done, I won't see anything for examining, because it happens that this person didn't do any examining in that year.

What I need, therefore, is to have each year reported, with the cumulative total back to that year, even if there are no records of the selected type for that year - so more like this (I've left out testing for the sake of brevity, but it should report all categories where any work has been done at any time in the period):

YearName PersonID CatName YearID TotalHours TotalSinceDate
2022-23 1 Examining 6 10 10
2021-22 1 Examining 5 20 30
2020-21 1 Examining 4 56 86
2019-20 1 Examining 4 0 86
2018-19 1 Examining 4 0 86
2017-18 1 Examining 1 24 110

Can anyone advise how best to do this please?


Solution

  • You (presumbly) want to CROSS JOIN your Years, People and WorkCategories tables, and then LEFT JOIN to your WorkDone table. Then you can use a windowed SUM to get the running total.

    I have a WHERE in the below as your expected results are only for a specific person and category, however, if you removed those it would still work, generating a row for each distinct set of year, person and category.

    SELECT Y.YearName,
           P.PersonID,
           WC.CatID,
           WC.CatName,
           Y.YearID,
           ISNULL(WD.HoursWorked,0) AS Totalhours,
           SUM(WD.HoursWorked) OVER (PARTITION BY P.PersonID, WC.CatName ORDER BY Y.YearName DESC
                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS TotalSinceDate
    FROM dbo.Years Y
         CROSS JOIN dbo.People P
         CROSS JOIN dbo.WorkCategories WC
         LEFT JOIN dbo.WorkDone WD ON Y.YearID = WD.YearID
                                  AND P.PersonID = WD.PersonID
                                  AND WC.CatID = WD.WorkCategory
    WHERE P.PersonID = 1
      AND WC.CatName = 'Examining'
    ORDER BY P.PersonID,
             WC.CatID,
             Y.YearName DESC;
    

    db<>fiddle