I have a table where information is being stored like so:
Work Order | Employee | FunctionGroup | FunctionType | Timestamp |
---|---|---|---|---|
WO1 | Emp1 | Group1 | Start | 7/27/23 09:00 |
WO1 | Emp1 | Group1 | Stop | 7/27/23 10:00 |
WO1 | Emp1 | Group1 | Start | 7/27/23 11:00 |
WO1 | Emp1 | Group1 | Stop | 7/27/23 12:00 |
WO2 | Emp2 | Group2 | Start | 7/27/23 13:00 |
WO2 | Emp2 | Group2 | Stop | 7/27/23 14:00 |
WO2 | Emp2 | Group2 | Start | 7/27/23 15:00 |
WO2 | Emp2 | Group2 | Stop | 7/27/23 16:00 |
WO3 | Emp3 | Group3 | Start | 7/27/23 17:00 (problem here: since the next row is also a Start, then this row should be returned) |
WO3 | Emp3 | Group3 | Start | 7/27/23 18:00 |
WO3 | Emp3 | Group3 | Start | 7/27/23 19:00 |
WO3 | Emp3 | Group3 | Stop | 7/27/23 20:00 |
WO4 | Emp4 | Group4 | Stop | 7/27/23 17:00 (problem here: since the dataset for this partition starts with a Stop instead of a Start, then this row should be returned) |
WO4 | Emp4 | Group4 | Start | 7/27/23 18:00 |
WO4 | Emp4 | Group4 | Start | 7/27/23 19:00 |
WO4 | Emp4 | Group4 | Stop | 7/27/23 20:00 |
For each work order, employee, and function group, an employee can insert a Start and a Stop row (this is basically how the partition is defined). They must always be in the order of Start then Stop and cannot be backwards. This data collection will all start on a specific date/time, so there is a clean point where all the data must come in beginning with a Start. They can insert these pairs as many times as the need to. I need to write a query that checks to see if there is a problem with the pairings and if so - return the first row where this problem appeared.
In the table above, the last 2 sections show a potential problem and the row that must be returned. The main challenge here is just figuring out how to determine which rows go together as a pair. One alternative way I am thinking of handling this is to simply insert a unique ID with every Start, then insert that same ID with every Stop. This might work, but for now I need a query that can show me problems for the test data I'm using.
So based on the expectation that the data should be in start/stop pairs, you can allocate a row number and then compare the expected FunctionType with the actual FunctionType, or rather, since you want the error to appear on the line before, the next FunctionType (using LEAD
) with the expected FunctionType.
declare @TestData table (WorkOrder varchar(3), Employee varchar(4), FunctionGroup varchar(6), FunctionType varchar(5), [Timestamp] datetime)
insert into @TestData (WorkOrder, Employee, FunctionGroup, FunctionType, [Timestamp])
values
('WO1','Emp1','Group1','Start','7/27/23 09:00'),
('WO1','Emp1','Group1','Stop','7/27/23 10:00'),
('WO1','Emp1','Group1','Start','7/27/23 11:00'),
('WO1','Emp1','Group1','Stop','7/27/23 12:00'),
('WO2','Emp2','Group2','Start','7/27/23 13:00'),
('WO2','Emp2','Group2','Stop','7/27/23 14:00'),
('WO2','Emp2','Group2','Start','7/27/23 15:00'),
('WO2','Emp2','Group2','Stop','7/27/23 16:00'),
('WO3','Emp3','Group3','Start','7/27/23 17:00'),-- (problem here: since the next row is also a Start, then this row should be returned)
('WO3','Emp3','Group3','Start','7/27/23 18:00'),
('WO3','Emp3','Group3','Start','7/27/23 19:00'),
('WO3','Emp3','Group3','Stop','7/27/23 20:00'),
('WO4','Emp4','Group4','Stop','7/27/23 17:00'),-- (problem here: since the dataset for this partition starts with a Stop instead of a Start, then this row should be returned)
('WO4','Emp4','Group4','Start','7/27/23 18:00'),
('WO4','Emp4','Group4','Start','7/27/23 19:00'),
('WO4','Emp4','Group4','Stop','7/27/23 20:00');
with cte as (
select *
, row_number() over (partition by WorkOrder, Employee, FunctionGroup order by [Timestamp]) rn
, lead(FunctionType) over (partition by WorkOrder, Employee, FunctionGroup order by [Timestamp]) FunctionTypeLead
from @TestData
)
select WorkOrder, Employee, FunctionGroup, FunctionType, [Timestamp]
from cte
where rn%2 = 1 and FunctionTypeLead != 'Stop'
order by WorkOrder, Employee, FunctionGroup, [Timestamp];
Returns:
WorkOrder | Employee | FunctionGroup | FunctionType | Timestamp |
---|---|---|---|---|
WO3 | Emp3 | Group3 | Start | 2023-07-27 17:00:00.000 |
WO4 | Emp4 | Group4 | Stop | 2023-07-27 17:00:00.000 |
Note: Providing the DDL+DML (as shown here) makes it much easier to answer.