Search code examples
sqlsql-serverdatetimerecursive-querysql-server-2019

SQL Flag Rows Occurring 2 Hours After Previous Flag


I have some messy RFID data due to over sensitive antenna's. There is a physical process that tracks an RFID tag moving through different stations in a cycle. An item with an RFID tag can move through the cycle more than one time a day, but it is highly unlikely that it could start the cycle within a two hour window of that first read.

I am trying to either create a flag column to determine when the new cycle begins for an item or return a count of how many times an item has gone through a cycle.

Here is some sample data:

CREATE TABLE [dbo].[samplerfiddata](
    [Item] [nvarchar](50) NOT NULL,
    [Station_Type] [nvarchar](50) NOT NULL,
    [Station_Name] [nvarchar](50) NOT NULL,
    [Timestamp] [datetime2](7) NOT NULL,
    [Trying_to_Create_this_Flag_Column] [nvarchar](50) NOT NULL
) ON [PRIMARY]
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Decontamination', N'Decontamination', CAST(N'2020-10-10T06:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Decontamination', N'Decontamination', CAST(N'2020-10-11T14:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Washer', N'Washer', CAST(N'2020-10-11T14:45:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Decontamination', N'Decontamination', CAST(N'2020-10-11T15:15:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Other', N'Decontamination', CAST(N'2020-10-11T23:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Washer', N'Washer', CAST(N'2020-10-12T00:15:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Other', N'Decontamination', CAST(N'2020-10-12T00:45:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Other', N'Decontamination', CAST(N'2020-10-13T16:00:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Other', N'Decontamination', CAST(N'2020-10-13T16:30:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'A', N'Decontamination', N'Decontamination', CAST(N'2020-10-14T13:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Other', N'Decontamination', CAST(N'2020-10-12T08:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Decontamination', N'Decontamination', CAST(N'2020-10-12T14:30:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Washer', N'Washer', CAST(N'2020-10-12T14:45:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Decontamination', N'Decontamination', CAST(N'2020-10-12T15:15:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Decontamination', N'Decontamination', CAST(N'2020-10-12T18:00:00.0000000' AS DateTime2), N'1')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Washer', N'Washer', CAST(N'2020-10-13T18:15:00.0000000' AS DateTime2), N'0')
GO
INSERT [dbo].[RFID SQL Problem Sample Data] ([Item], [Station_Type], [Station_Name], [Timestamp], [Trying_to_Create_this_Flag_Column]) VALUES (N'B', N'Decontamination', N'Decontamination', CAST(N'2020-10-13T19:00:00.0000000' AS DateTime2), N'1')
GO

In this data we have two different items moving through the cycle with all scenarios captured. The business logic for a new cycle is defined as when an item is read by an RFID Antenna (Station) where the Station Type = Decontamination or the Station Name = Decontamination and it has been more than two hours since the previous first read of the cycle.

Item | Station_Type    | Station_Name    | Timestamp                   | Trying_to_Create_this_Flag_Column
:--- | :-------------- | :-------------- | :-------------------------- | :--------------------------------
A    | Decontamination | Decontamination | 2020-10-10 06:30:00.0000000 | 1                                
A    | Decontamination | Decontamination | 2020-10-11 14:30:00.0000000 | 1                                
A    | Washer          | Washer          | 2020-10-11 14:45:00.0000000 | 0                                
A    | Decontamination | Decontamination | 2020-10-11 15:15:00.0000000 | 0                                
A    | Other           | Decontamination | 2020-10-11 23:30:00.0000000 | 1                                
A    | Washer          | Washer          | 2020-10-12 00:15:00.0000000 | 0                                
A    | Other           | Decontamination | 2020-10-12 00:45:00.0000000 | 0                                
A    | Other           | Decontamination | 2020-10-13 16:00:00.0000000 | 1                                
A    | Other           | Decontamination | 2020-10-13 16:30:00.0000000 | 0                                
A    | Decontamination | Decontamination | 2020-10-14 13:30:00.0000000 | 1                                
B    | Other           | Decontamination | 2020-10-12 08:30:00.0000000 | 1                                
B    | Decontamination | Decontamination | 2020-10-12 14:30:00.0000000 | 1                                
B    | Washer          | Washer          | 2020-10-12 14:45:00.0000000 | 0                                
B    | Decontamination | Decontamination | 2020-10-12 15:15:00.0000000 | 0                                
B    | Decontamination | Decontamination | 2020-10-12 18:00:00.0000000 | 1                                
B    | Washer          | Washer          | 2020-10-13 18:15:00.0000000 | 0                                
B    | Decontamination | Decontamination | 2020-10-13 19:00:00.0000000 | 1                                

Ultimately, I am trying to return a result set that looks like this:

item | cycles
:--- | -----:
A    |      5
B    |      4

How could this be completed?


Solution

  • If I follow you correctly, you need a recursive query:

    with 
        data as (
            select item, station_type, station_name, timestamp, 
                row_number() over(partition by item order by timestamp) rn
            from samplerfiddata sf
        ),
        cte as (
            select d.*, timestamp first_timestamp 
            from data d 
            where rn = 1
            union all
            select d.*, 
                case when 'Decontamination' in (d.station_type, d.station_name) and d.timestamp > dateadd(hour, 2, c.first_timestamp)
                    then d.timestamp
                    else c.timestamp
                end
            from cte c
            inner join data d on d.item = c.item and d.rn = c.rn + 1
        )
    select *
    from cte
    order by item, timestamp
    

    The idea is to iteratively walk the dataset of each item, while keeping track of the last cycle start (here aliased as first_timestamp); when a "decontamination" event is met that is more than two hours later than the previous one, the cycle timestamp resets.

    You can get the count of cycles by using aggregation in the outer query, like:

    select item, count(*) cnt_cycles
    from cte
    where timestamp = first_timestamp
    group by item
    

    Demo on DB Fiddle