I have a query that generates a monthly report for drivers evaluations. Some drivers may not have reports for some months. The reports contain various of violations, one of these violations is a monthly cumulated violation, and it resets every year which is done in in separate query. Everything where working just fine, except when the client added a new requirement to this part. The requirement is to change it from resetting every year, to resetting any violation that passed 180 days during the year.
Here is a report's sample (for one driver report):
RN ReportId DriverId StartDate EndDate Level1Vio Lv1YTD
-------------------- -------------------- -------------------- ----------------------- ----------------------- ----------- -----------
1 64 2073 2020-10-21 00:00:00.000 2020-11-21 23:59:59.000 1 1
2 65 2073 2020-11-24 05:13:04.133 2020-12-24 05:13:04.133 0 1
3 67 2073 2020-12-23 06:53:52.870 2021-01-23 06:53:52.870 0 1
4 68 2073 2021-01-22 06:33:43.127 2021-02-22 06:33:43.127 0 1
5 69 2073 2021-02-23 04:02:58.680 2021-03-23 04:02:58.680 1 2
6 70 2073 2021-03-22 23:39:33.570 2021-04-22 23:39:33.570 0 2
7 71 2073 2021-04-22 00:28:35.230 2021-05-22 00:28:35.230 0 2
8 72 2073 2021-05-22 15:46:21.767 2021-06-22 15:46:21.767 1 3
9 73 2073 2021-06-25 06:42:02.130 2021-07-25 06:42:02.130 1 4
10 76 2073 2021-07-23 17:42:01.533 2021-08-23 17:42:01.533 0 4
The old query that where working was (generates the above sample) :
SELECT
RN
, ReportId
, DriverId
, StartDate
, EndDate
, Level1Vio
, Lv1YTD = SUM(Level1Vio) OVER(PARTITION BY DriverId ORDER BY ReportId ROWS UNBOUNDED PRECEDING)
FROM Report
The excepted result should be (for the new requirement):
RN ReportId DriverId StartDate EndDate Level1Vio Lv1YTD
-------------------- -------------------- -------------------- ----------------------- ----------------------- ----------- -----------
1 64 2073 2020-10-21 00:00:00.000 2020-11-21 23:59:59.000 1 1
2 65 2073 2020-11-24 05:13:04.133 2020-12-24 05:13:04.133 0 1
3 67 2073 2020-12-23 06:53:52.870 2021-01-23 06:53:52.870 0 1
4 68 2073 2021-01-22 06:33:43.127 2021-02-22 06:33:43.127 0 1
5 69 2073 2021-02-23 04:02:58.680 2021-03-23 04:02:58.680 1 2
6 70 2073 2021-03-22 23:39:33.570 2021-04-22 23:39:33.570 0 2
7 71 2073 2021-04-22 00:28:35.230 2021-05-22 00:28:35.230 0 1
8 72 2073 2021-05-22 15:46:21.767 2021-06-22 15:46:21.767 1 2
9 73 2073 2021-06-25 06:42:02.130 2021-07-25 06:42:02.130 1 3
10 76 2073 2021-07-23 17:42:01.533 2021-08-23 17:42:01.533 0 3
Explanation
If you take ReportId = 64
(Nov 2020 Report) the driver has a registered violation, which should be removed after 180 days, and then in ReportId = 69
(Mar 2021 Report) got another violation. The first violation should end at ReportId = 71
(May 2021 Report) and the second one should be removed at ReportId = 76
(Aug 2021 Report). This principle is applied on each violation.
Trys
I'm trying hard to avoid cursor
and while
loops actually, but I couldn't find a way to do this under one query SELECT
like what I did in the above query. I've tried CTE
and joins, but with no luck. Everything forces me to use loops, which I try to avoid (because of performance issues). I'm sure there is a way to solve it with simple SELECT
, but I'm out of thoughts. So, any thoughts or solutions would be much appreciated.
Instead of window functions, we can just join with the records which overlap by 180 days.
Updated fiddle: Working Test Case
SELECT t1.rn, t1.ReportId, t1.DriverId, t1.StartDate, t1.EndDate, t1.Level1Vio
, SUM(t2.Level1Vio) AS Lv1YTD
FROM Report AS t1
JOIN Report AS t2
ON t1.StartDate BETWEEN t2.StartDate AND DATEADD(day, 180, t2.StartDate)
AND t1.DriverId = t2.DriverId
GROUP BY t1.rn, t1.DriverId, t1.ReportId, t1.StartDate, t1.EndDate, t1.Level1Vio
ORDER BY t1.StartDate
;
Note: I used 180 days from the start of the reported violation. Adjust as needed.
Also note: I included a version which uses a CTE term. You can use that if you wish to filter the rows of interest on both sides of the join more conveniently, restricting them to a specific year, or just add that year restriction to the join logic or add a WHERE
clause.
There are quite a few ways to include the year logic all within this query, without resorting to separate queries to generate per year.