Search code examples
sqlazure-sql-database

Add a grouping column to keep time entries together based on on its value, then repeat the id until the next value occurs


I have time punch records for employees (in a SQL Table) and I'm looking for a way to group them together to identify the start and end of their shift. In some cases, employees will work an over night shift starting on one day and ending on the next, then they start their new shift that evening. In other cases, they would start and end on the same day. I need to keep the records together as shown in last column below. I was hoping could use the Lag() column and generate a new value in *New Group column (shown in Table) each time the Lag() Hrs > 10 or something as it represents the number of hours between last value of EndLocal and current value of StartLocal.

ID EmpId DateLocal StartLocal EndLocal Lag() Hrs *New Group
1281052 6917 1/1/2025 1/1/25 00:00 1/1/25 04:00 0 Shift 12/31/24
1281752 6917 1/1/2025 1/1/25 05:46 1/1/25 07:35 1 Shift 12/31/24
1283216 6917 1/1/2025 1/1/25 22:19 1/2/25 00:00 15 Shift 01/01/25
1284582 6917 1/2/2025 1/2/25 00:00 1/2/25 04:10 0 Shift 01/01/25
1285299 6917 1/2/2025 1/2/25 05:10 1/2/25 07:32 1 Shift 01/01/25
1300771 6917 1/6/2025 1/6/25 22:14 1/2/25 00:00 111 Shift 01/06/25
1303070 6917 1/7/2025 1/6/25 00:00 1/7/25 00:00 0 Shift 01/06/25
1303560 6917 1/7/2025 1/7/25 04:22 1/7/25 08:01 1 Shift 01/06/25

I tried using ROW_NUMBER Over but I couldn't figure out how to force it to generate the id when the lag() Hrs column was greater than 10.

with cte_punches as (
        SELECT
            t.Id,
            e.EmployeeId,
            t.DateLocal,
            t.StartLocal,
            t.EndLocal,
            COALESCE(ABS(DATEDIFF(hh, LAG(t.EndLocal, 1,NULL) OVER(PARTITION BY e.EmployeeId Order BY t.Id),t.StartLocal)), 0) ClockedHours
        FROM
            dbo.prod_bamboo_timesheetentries AS t
            LEFT OUTER JOIN dbo.prod_bamboo_employeelist AS e ON t.EmployeeId = e.Id 
        WHERE
            (UPPER(e.Location) LIKE '%PH%'
            AND NOT UPPER(t.TaskName) LIKE '%BREAK%'
            AND NOT UPPER(t.TaskName) LIKE '%LUNCH%'
            OR t.TaskName IS NULL)
                and e.EmployeeId = '6917' --'5526'
                and t.DateLocal between '2025-01-01' and '2025-01-10'
        )
SELECT
    Id,
    EmployeeId,
    DateLocal,
    StartLocal,
    EndLocal,
    ClockedHours,
    row_number() OVER(PARTITION BY ClockedHours ORDER BY Id) GroupIdentifier
FROM
    cte_punches
ORDER BY 
    EmployeeId,
    Id

Solution

  • The trick is to sum 1 every time your lagged value is large enough to trigger a new group e.g.

    I have assumed you are looking for the first date in the group when creating the shift name, but that doesn't match the results you have provided.

    create table cte_punches (ID int, EmpId int, DateLocal date, StartLocal datetime2, EndLocal datetime2, LaggedHours int)
    
    insert into cte_punches (ID,    EmpId,  DateLocal,  StartLocal, EndLocal, LaggedHours)
    values
    (1281052,   6917,   '1/1/2025', '1/1/25 00:00', '1/1/25 04:00', 0),
    (1281752,   6917,   '1/1/2025', '1/1/25 05:46', '1/1/25 07:35', 1),
    (1283216,   6917,   '1/1/2025', '1/1/25 22:19', '1/2/25 00:00', 15),
    (1284582,   6917,   '1/2/2025', '1/2/25 00:00', '1/2/25 04:10', 0),
    (1285299,   6917,   '1/2/2025', '1/2/25 05:10', '1/2/25 07:32', 1),
    (1300771,   6917,   '1/6/2025', '1/6/25 22:14', '1/2/25 00:00', 111),
    (1303070,   6917,   '1/7/2025', '1/6/25 00:00', '1/7/25 00:00', 0),
    (1303560,   6917,   '1/7/2025', '1/7/25 04:22', '1/7/25 08:01', 1);
    
    with cte as (
      select *
        -- If we add 1 to our sum every time a new shift starts we get our group
        , sum(case when LaggedHours > 10 then 1 else 0 end) over (partition by EmpId order by DateLocal asc, StartLocal) GroupNum
      from cte_punches
    )
    select *
      -- Then calculate the shift name, in this case taking the first date in the shift
      , 'shift-' + convert(varchar(12), min(DateLocal) over (partition by GroupNum)) ShiftName
    from cte
    order by DateLocal asc, StartLocal asc;
    
    ID EmpId DateLocal StartLocal EndLocal LaggedHours GroupNum ShiftName
    1281052 6917 2025-01-01 2025-01-01 00:00:00.0000000 2025-01-01 04:00:00.0000000 0 0 shift-2025-01-01
    1281752 6917 2025-01-01 2025-01-01 05:46:00.0000000 2025-01-01 07:35:00.0000000 1 0 shift-2025-01-01
    1283216 6917 2025-01-01 2025-01-01 22:19:00.0000000 2025-01-02 00:00:00.0000000 15 1 shift-2025-01-01
    1284582 6917 2025-01-02 2025-01-02 00:00:00.0000000 2025-01-02 04:10:00.0000000 0 1 shift-2025-01-01
    1285299 6917 2025-01-02 2025-01-02 05:10:00.0000000 2025-01-02 07:32:00.0000000 1 1 shift-2025-01-01
    1300771 6917 2025-01-06 2025-01-06 22:14:00.0000000 2025-01-02 00:00:00.0000000 111 2 shift-2025-01-06
    1303070 6917 2025-01-07 2025-01-06 00:00:00.0000000 2025-01-07 00:00:00.0000000 0 2 shift-2025-01-06
    1303560 6917 2025-01-07 2025-01-07 04:22:00.0000000 2025-01-07 08:01:00.0000000 1 2 shift-2025-01-06

    fiddle