Search code examples
sql-servert-sqlcommon-table-expressionrecursive-querygaps-and-islands

How do I make my CTE search through my data again in T-SQL/MSSQL?


Currently, I have a list of time entries for employees that looks like this:

TimeSheetId   TechnicianUserId    TimeSheetDate   FromDatetime              ToDatetime
    1215286               4730    2020-11-10      2020-11-10 14:15:00.000   2020-11-10 15:15:00.000
    1215965               4730    2020-11-10      2020-11-10 15:15:00.000   2020-11-10 15:45:00.000
    1215969               4730    2020-11-10      2020-11-10 15:45:00.000   2020-11-10 17:45:00.000
    1215972               4730    2020-11-10      2020-11-10 17:45:00.000   2020-11-10 23:45:00.000
    1215967               4730    2020-11-10      2020-11-10 23:45:00.000   2020-11-10 23:59:00.000
    1215968               4730    2020-11-11      2020-11-11 00:00:00.000   2020-11-11 00:15:00.000
    1215978               4730    2020-11-11      2020-11-11 00:15:00.000   2020-11-11 00:30:00.000
    1215980               4730    2020-11-11      2020-11-11 16:00:00.000   2020-11-11 16:30:00.000
    1215979               4735    2020-11-11      2020-11-11 00:30:00.000   2020-11-11 08:30:00.000

Because of how our time is processed, I want rows 6 and 7 to have their TimeSheetDate rolled back a day, since it's technically contiguous time with the rows before it, and it is handled by our system as all occurring on November 10 instead of November 11. It should not roll back row 8 (not immediately after row 7) or row 9 (different employee number).

When I try a nested query, I only catch row 6. When I attempt a CTE without the TimeSheetId (since I'm not sure how to implement a way to count on it), it doesn't know when to stop iterating and bombs out. How can I make sure I catch both rows 6 & 7?

CTE I've attempted:

;WITH CTEDummyData (
                     [TechnicianUserId]
                    ,[TimeSheetDate]
                    ,[FromDatetime]
                    ,[ToDatetime]
                    )
AS (
SELECT [TechnicianUserId]
      ,[TimeSheetDate]
      ,[FromDatetime]
      ,[ToDatetime]
  FROM @DummyTime
 UNION ALL
SELECT [TechnicianUserId]
      ,CASE
            WHEN DATEDIFF(MINUTE,LAG([ToDateTime]) OVER (ORDER BY [FromDatetime]),[FromDatetime]) < 2
                    AND CAST([FromDatetime] AS DATE) <> LAG([TimeSheetDate]) OVER (ORDER BY [FromDatetime])
            THEN DATEADD(DAY,-1,[TimeSheetDate])
            ELSE [TimeSheetDate]
        END
      ,[FromDatetime]
      ,[ToDatetime]
  FROM CTEDummyData
    )
SELECT *
  FROM CTEDummyData
OPTION (MAXRECURSION 24)

Solution

  • I attempted a solution using SQL Loops to update the data in iterations until it's all like you want it.

    DECLARE @NOMOREFORMATTINGNEEDED AS BIT = 0;
    IF (EXISTS(SELECT * FROM SYS.tables WHERE name like '%TempDummyTimeTable%'))
    BEGIN
        Drop table #TempDummyTimeTable;
    END
    SELECT * INTO #TempDummyTimeTable FROM DummyTimeTable;
    
    WHILE (@NOMOREFORMATTINGNEEDED = 0)
    BEGIN
        update tl set TimeSheetDate = t.TimeSheetDate from #TempDummyTimeTable tl 
                            join #TempDummyTimeTable t on   tl.TechnicianUserId = t.technicianuserid 
                            and ABS(DATEDIFF (MINUTE, tl.FromDatetime, t.ToDatetime)) <= 1
                            and tl.TimeSheetDate <> t.TimeSheetDate 
    
        IF (not exists(select * from #TempDummyTimeTable tl JOIN #TempDummyTimeTable t on   tl.TechnicianUserId = t.technicianuserid 
                            and ABS(DATEDIFF (MINUTE, tl.FromDatetime, t.ToDatetime)) <= 1
                            and tl.TimeSheetDate <> t.TimeSheetDate ) )
        BEGIN
            SET @NOMOREFORMATTINGNEEDED = 1
        END
    END
    
    select * from #TempDummyTimeTable
    

    Basically you create a new temporary table out of the original table. You then check if there's any instance of a row that is continuous with another row for the same employee but has different TimeSheetDate, and you then update the TimeSheetDate of it. You keep doing this in the loop until the flag @NOMOREFORMATTINGNEEDED is set to true. What sets this flag to true is a check that verifies there are no more continuous records with different TimeSheetDate. I checked that the FromDateTime of one is within one minute of the ToDateTime of the other for this check. Let me know if this works for you.