Search code examples
sql-serversql-server-2008gaps-and-islands

How to find the boundaries of groups of contiguous sequential numbers?


I have a table with the following definition

CREATE TABLE mytable
  (
     id     INT IDENTITY(1, 1) PRIMARY KEY,
     number BIGINT,
     status INT
  )

and example data

INSERT INTO mytable
VALUES (100,0),
       (101,0),
       (102,0),
       (103,0),
       (104,1),
       (105,1),
       (106,0),
       (107,0),
       (1014,0),
       (1015,0),
       (1016,1),
       (1017,0)

Looking only at the rows where status = 0 how can I collapse the Number values into ranges of contiguous sequential numbers and find the start and end of each range?

i.e. For the example data the results would be

         FROM      to 
Number    100      103
Number    106      107
Number    1014     1015
Number    1017     1017

Solution

  • As mentioned in the comments this is a classic gaps and islands problem.

    A solution popularized by Itzik Ben Gan is to use the fact that DENSE_RANK() OVER (ORDER BY number) - number remains constant within an "island" and cannot appear in multiple islands.

    WITH T
         AS (SELECT DENSE_RANK() OVER (ORDER BY number) - number AS Grp,
                    number
             FROM   mytable
             WHERE  status = 0)
    SELECT MIN(number) AS [From],
           MAX(number) AS [To]
    FROM   T
    GROUP  BY Grp
    ORDER  BY MIN(number);
    

    On later versions my preference is to use LAG/LEAD for this to reduce the amount of sorting going on. The below should work from 2012+

    WITH T AS
    (
    SELECT *,
           MinNumber  = MIN(number) OVER (ORDER BY number ROWS UNBOUNDED PRECEDING),
           NextNumber = LEAD(number) OVER (ORDER BY number)
    FROM mytable
    WHERE status = 0
    )
    SELECT LAG(NextNumber, 1, MinNumber) OVER (ORDER BY number) AS [From], 
           number AS [To]
    FROM T
    WHERE NextNumber IS NULL OR NextNumber <> number+ 1