Search code examples
sqlsql-servert-sqlgaps-and-islands

Records with DateFrom and DateTo - date range optimalization


Here is a reproducible example of what I want to achieve. Basically I have a datefrom column, dateto column and some value columns. Data in table:

Name DateFrom DateTo Role
John 2000-01-01 2001-11-22 driver
John 2000-05-01 2001-10-22 driver
John 2002-12-01 2002-12-12 driver
John 2000-12-01 2005-12-12 teacher
John 2000-09-01 2001-12-01 driver
John 1999-12-01 2000-01-01 driver
George 1999-12-01 2000-01-01 driver

I need to merge overlapping ranges within the same Name and Role, to get this output:

Name DateFrom DateTo Role
John 1999-12-01 2001-12-01 driver
John 2002-12-01 2002-12-12 driver
John 2000-12-01 2005-12-12 teacher
George 1999-12-01 2000-01-01 driver

Thank you


Solution

  • This is a classic gaps-and-islands problem. There are many different approaches.

    A common solution, if somewhat inefficient, is to use LAG and/or LEAD to identify the beginning of each group of rows, then assign group numbers using a conditional windowed COUNT. Then simply group by the group number

    WITH NewIntervals AS (
        SELECT *,
          isNew = CASE WHEN LAG(DateTo) OVER (PARTITION BY Name ORDER BY DateFrom) >= DateFrom
                        AND LAG(Role)   OVER (PARTITION BY Name ORDER BY DateFrom) = Role
                       THEN NULL ELSE 1 END
        FROM YourTable t
    ),
    Groups AS (
        SELECT *,
          grp = COUNT(isNew) OVER (PARTITION BY Name ORDER BY DateFrom ROWS UNBOUNDED PRECEDING)
        FROM NewIntervals
    )
    SELECT
      Name,
      Role = MIN(Role),
      DateFrom = MIN(DateFrom),
      DateTo = MAX(DateTo)
    FROM Groups
    GROUP BY
      Name,
      grp;
    

    db<>fiddle