In the dataset I have many Case ID’s all of them that goes through some statuses, the process is not fixed, but at a point in time the Case's should be Closed in the CLSD status.
For the two CaseId examples I would like to get (for each) the rows that is before including CLSD row. For instance, CaseId 789101 I would like to get all the rows except NCMP and PLIP since they have a higher FromDateTime and / or EventNumber than CLSD row. There is cases that has now reach CLSD status yet and those I want all the rows from.
ToDateTimeWithNull column with NULL values represent the current status of the CaseId.
This can be solved with EXISTS selection:
WITH yourdataset AS (
SELECT *
FROM
(
VALUES (N'123456', N'INIT', N'2024-01-20 08:44:10.000', N'2024-01-22 12:04:24.000', N'2024-01-22 12:04:24.000', 1)
, (N'123456', N'CANC', N'2024-01-22 12:04:24.000', N'2024-01-22 12:04:24.000', N'2024-01-22 12:04:24.000', 2)
, (N'123456', N'CLSD', N'2024-01-22 12:04:24.000', N'2024-01-22 12:04:24.000', N'2024-01-22 12:04:24.000', 3)
, (N'123456', N'NCMP', N'2024-01-22 12:04:24.000', N'2024-05-21 11:51:24.017', NULL, N'4')
, (N'789101', N'INIT', N'2023-01-02 08:22:21.000', N'2023-03-19 15:07:14.000', N'2023-03-19 15:07:14.000', 1)
, (N'789101', N'AWAP', N'2023-03-19 15:07:14.000', N'2023-03-19 15:07:35.000', N'2023-03-19 15:07:35.000', 2)
, (N'789101', N'RICT', N'2023-03-19 15:07:35.000', N'2023-03-19 15:15:31.000', N'2023-03-19 15:15:31.000', 3)
, (N'789101', N'CANC', N'2023-03-19 15:15:31.000', N'2023-03-20 05:00:28.000', N'2023-03-20 05:00:28.000', 4)
, (N'789101', N'CLSD', N'2023-03-20 05:00:28.000', N'2023-03-20 05:00:28.000', N'2023-03-20 05:00:28.000', 5)
, (N'789101', N'NCMP', N'2023-03-20 05:00:28.000', N'2024-05-13 08:22:35.000', N'2024-05-13 08:22:35.000', 6)
, (N'789101', N'PLIP', N'2024-05-13 08:22:35.000', N'2024-05-21 11:51:24.017', NULL, N'7')
) t (caseid, activityname, fromdatetime, todatetime, todatetimewithnull, eventnumber)
)
SELECT * FROM yourdataset d
WHERE EXISTS(
SELECT 1
FROM yourdataset d2
WHERE d2.caseID = d.CaseID
AND d2.ActivityName = 'CLSD'
AND d2.EventNumber >= d.EventNumber
)
You look "forward" for each caseID until you hit the condition you want, in this case ActivityName='CLSD'.
I used EventNumber as sequence value, but if your index is on CaseID,FromDateTime, you can use: d2.FromDateTime >= d.FromDateTime
This works well, unless you have multiple CLSD events.