I am trying to write a query to return currently unfilled training requirements.
The 4 main tables involved are:
Here is what I have so far:
SELECT *
FROM TrainingRequirements
INNER JOIN Users on Users.Id = TrainingRequirements.UserId
INNER JOIN Courses on Courses.Id = TrainingRequirements.CourseId
LEFT OUTER JOIN UserTrainings on UserTrainings .CourseId = TrainReq.CourseId
AND UserTrainings .UserId = TrainingRequirements.UserId
WHERE
Users.Active = 1 AND
TrainingRequirements.Active = 1 AND
(UserTrainings.Id is null OR
--covers if they never took the training at all
GETDATE() >= DATEADD(month, Courses.MonthsValid, UserTrainings.DateCompleted))
--covers if they took the training but it has expired and must be retaken
This almost works. It returns me instances where they never took the training (so UserTraining.Id is null because no matching record exists on the right side of the left outer join), but it also returns to me a bunch of older completed trainings that are no longer applicable.
If a training is required annually, the user training would accumulate multiple records over the years. I only want to compare it to the most recent completion, that is the highest DateCompleted per user per training, so I assume I need to use some sort of MAX() or Group By or subquery in the join, but I can't figure it out.
Sample Data: TrainingRequirements
Id | UserId | CourseId | active
1 | 1 |1 | 1
2 | 1 |2 | 1
3 | 1 |3 | 0
4 | 1 |4 | 1
UserTrainings
Id | UserId | CourseId | DateCompleted
1 |1 |1 | 2022-10-23 12:20:11.526
2 |1 |2 | 2021-11-15 05:01:12.320
3 |1 |2 | 2020-12-11 11:11:40.320
Courses
Id | Name | MonthsValid
1 | 'Cool Training' | 36
2 | 'Secret Training' | 12
3 | 'Top Secret Training' | 6
4 | 'Unpopular Training' | 12
In the above, Requirement 3 is not active so can be ignored, Requirement 1 has a completion on 10/23/2023 and is valid for 36 months, so it's not expired and can also be ignored. 2 and 4 should be returned. 4 was never completed, and 2 was completed but since the expiration date (date of previous completion + months valid)
Desired Results
TrainingRequirementId | UserId | CourseId | LastTaken | Expired
2 | 1 | 2 | 11/15/2021| 11/15/2022
4 | 1 | 4 | null | null
Using my query returns an additional row because the UserTraining with ID 3 is also matched. But since that is older an older completion I want to exclude it.
So without actual data to run this against, I'm kind of fumbling in the dark (i.e. I can't verify this actually does what you want) but it sounds like you want to get users who need to take a course, or took it, but over n months ago.
To do this, you need the first row for each combination of UserID
and CourseID
when ordered by DateCompleted
. The function row_number is a window function which helps with this. It adds a series of numbers (i.e. 1, 2, 3, etc.) over each "partition" (e.g. in your case UserID
, CourseId
) ordered by whatever you choose (in this case, DateCompleted desc
) and since you order it by DateCompleted desc
the row with the value of 1 will always be the most recenty for that partition.
You should be able to run something like this to see what that looks like:
select
RID = row_number() over (partition by UserID, CourseId order by DateCompleted desc),
CourseId,
UserID,
DateCompleted
from UserTrainings
From there, it's back to what you were doing before, except instead of joining to the actual UserTrainings
table, you join to the CTE
you created with the row_number
over it, and you add the additional join predicate where u.RID = 1
to only grab the rows out of the CTE
where DateCompleted
is the most recent.
Here's what I came up with. Again, without actual data, you may need to futz with this a bit, but I think this should get you on the right track.
;with ut as
(
select
RID = row_number() over (partition by UserID, CourseId order by DateCompleted desc),
CourseId,
UserID,
DateCompleted
from UserTrainings
)
select *
from TrainingRequirements tr
inner join users u
on tr.UserID = u.UserID
inner join courses c
on tr.CourseId = c.CourseID
left outer join ut u
on r.CourseId = u.CourseId
and r.UserID = u.UserId
and u.RID = 1 -- This gets the most recent record for each user/course in UserTrainings, sorted by DateCompleted
where u.CourseID is null -- they haven't taken it
or ut.DateCompleted < dateadd(month, r.MonthsValid, getdate())