Search code examples
sqlsql-serverdatewindow-functionsgaps-and-islands

How to group by continuous records in SQL


My table has these records

ID  Colour
------------
 1   Red
 2   Red
 3   Red
 4   Red
 5   Red
 6   Green
 7   Green
 8   Green
 9   Green
10   Red
11   Red
12   Red
13   Red
14   Green
15   Green
16   Green
17   Blue
18   Blue
19   Red
20   Blue

I can group by colour easily like this

SELECT Colour, MIN(ID) AS iMin, MAX(ID) AS iMax
FROM MyTable
GROUP BY Colour

This would return this result

Colour     iMin     iMax
-------------------------
Red        1        19
Green      6        16
Blue       17       20

But this is not what I want as Red does not go all the way from 1 to 19, Green breaks the sequence.

The result should be like this

Colour     iMin     iMax
------------------------
Red        1        5
Green      6        9
Red        10       13
Green      14       16
Blue       17       18
Red        19       19
Blue       20       20

I managed to do this by cursor, but wonder if there is a more efficient way to do that


Solution

  • This is a gaps-and-islands problem. Assuming that id is continously incrementing, you can use the difference between row_number() to define groups of "adjacent" records having the same colour:

    select 
        colour, 
        min(id) iMin,
        max(id) iMax
    from (
        select t.*, row_number() over(partition by colour order by id) rn
        from mytable t
    ) t
    group by colour, id - rn
    order by min(id)
    

    Demo on DB Fiddle:

    colour | iMin | iMax
    :----- | ---: | ---:
    Red    |    1 |    5
    Green  |    6 |    9
    Red    |   10 |   13
    Green  |   14 |   16
    Blue   |   17 |   18
    Red    |   19 |   19
    Blue   |   20 |   20