I'm working with a dataset (in SSMS) that looks something like this:
CaseID | State | EventTimeStamp | Cleared |
---|---|---|---|
1 | OrderReceived | 2024-03-20 | 0 |
1 | Completed | 2024-03-21 | 0 |
1 | Completed | 2024-03-22 | 1 |
2 | OrderReceived | 2024-03-24 | 0 |
2 | Completed | 2024-03-25 | 0 |
2 | Completed | 2024-03-26 | 1 |
2 | Completed | 2024-03-27 | 0 |
3 | OrderReceived | 2024-03-28 | 0 |
3 | OrderReceived | 2024-03-29 | 1 |
3 | Completed | 2024-03-31 | 0 |
EDIT FOR CONTEXT: The actual dataset has ~40 States, but I'd like to track about 25. Some states REQUIRE other states to be marked done, but others don't and some can be in progress simultaneously. Any state can have an event Clear = 1, but it would require a Clear = 0 event first i.e. an event that didn't happen can't be cleared. Events for any state will happen in 0, 1, 0,... Cleared order every single time.
For the purpose of this dummy dataset the MAX(EventTimeStamp) for an OrderReceived could be Cleared = 1, and the MAX(EventTimeStamp) for the Completed event for the same CaseID could be Cleared = 0. There is also potential for a case to have MAX(EventTimeStamp) for OrderReceived to be NULL and MAX(EventTimeStamp) for Completed to have a value. Additionally, it is possible for MAX(EventTimeStamp) for OrderReceived to be AFTER MAX(EventTimeStamp) for Completed. In that case the desired result would simply look like
CaseID | OrderReceived | Completed |
---|---|---|
X | 2024-03-20 | 2024-03-18 |
In that case the last EventTimeStamp would display for each. The engineer who made the database made didn't hardcode a lot of restrictions because a lot of changes are still being made to the workflow.
CREATE TABLE CaseTracking
(
[CaseID] bigint,
[State] VARCHAR(512),
[EventTimeStamp] date,
[Cleared] bit
);
INSERT INTO CaseTracking ([CaseID], [State], EventTimeStamp, Cleared) VALUES
('1', 'OrderReceived', '2024-03-20', '0'),
('1', 'Completed', '2024-03-21', '0'),
('1', 'Completed', '2024-03-22', '1'),
('2', 'OrderReceived', '2024-03-24', '0'),
('2', 'Completed', '2024-03-25', '0'),
('2', 'Completed', '2024-03-26', '1'),
('2', 'Completed', '2024-03-27', '0'),
('3', 'OrderReceived', '2024-03-28', '0'),
('3', 'OrderReceived', '2024-03-29', '1'),
('3', 'Completed', '2024-03-31', '0');
SELECT * FROM CaseTracking
I'd like to take a look at how long it takes between certain states and so I organized the table by column State using "group by". I was hoping to make a table that evaluated the MAX(EventTimeStamp) for each state, but if Cleared = 1 then the value would return NULL instead. The desired result would look like this. For this database, states can be cleared and it's recorded the same as any other event, but with Cleared = 1. If the MAX(TimeEventStamp) has a Cleared = 1 value that means it still needs to be done which is why I'd like a NULL return. The value for CaseID = 2 Completed is 2024-3-27 since I'd want the value associated with the MAX(EventTimeStamp). If Cleared happens in 0, 1, 0 order I'd like the timestamp of the last 0. If Cleared happens in 0, 1, 0, 1 order I'd like a NULL return
CaseID | OrderReceived | Completed |
---|---|---|
1 | 2024-03-20 | NULL |
2 | 2024-03-24 | 2024-03-27 |
3 | NULL | 2024-03-31 |
I've been playing around in https://dbfiddle.uk/JG38HeOH
I tried an AND conditional within the CASE clause, but ran into an issue where instead of returning the NULL I want for CaseID = 1, I instead get the MAX(TimeEventStamp) for any time the conditions are met
SELECT CaseID
,MAX(CASE WHEN State = 'OrderReceived' AND Cleared = 0 THEN EventTimeStamp
ELSE NULL END) AS OrderReceived
,MAX(CASE WHEN State = 'Completed' AND Cleared = 0 THEN EventTimeStamp
ELSE NULL END) AS Completed
FROM CaseTracking
GROUP BY CaseID
Results look like this
CaseID | OrderReceived | Completed |
---|---|---|
1 | 2024-03-20 | 2024-03-21 |
2 | 2024-03-24 | 2024-03-27 |
This gets the latest row for each state of each case, then pivots the rows to one row per case.
WITH
sorted AS
(
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY CaseID, State
ORDER BY EventTimeStamp DESC
)
AS row_id
FROM
CaseTracking
)
SELECT
CaseID,
MAX(CASE WHEN State = 'OrderReceived' THEN EventTimeStamp END) AS OrderReceived,
MAX(CASE WHEN State = 'Completed' THEN EventTimeStamp END) AS Completed
FROM
sorted
WHERE
row_id = 1
AND
Cleared = 0
GROUP BY
CaseID
CaseID | OrderReceived | Completed |
---|---|---|
1 | 2024-03-20 | null |
2 | 2024-03-24 | 2024-03-27 |