Search code examples
sqlsql-servert-sqlgroup-bysql-server-2017

TSQL - GROUP BY on continous rows only


I am working on a SQL Server 2017 (v14.0).

I have a table like this:

Key | State    | from       | until      |
----+----------+------------+------------+
100 | open     | 01.01.2021 | 01.01.2021 |
100 | open     | 02.01.2021 | 02.01.2021 |
100 | closed   | 03.01.2021 | 13.01.2021 |
100 | open     | 14.01.2021 | 20.01.2021 |
100 | open     | 20.01.2021 | 30.01.2021 |

I want to group it by Key and State, but only for continuous rows.

So my expected result would be something like:

Key | State    | from       | until      |
----+----------+------------+------------+
100 | open     | 01.01.2021 | 02.01.2021 |
100 | closed   | 03.01.2021 | 13.01.2021 |
100 | open     | 14.01.2021 | 30.01.2021 |

Any idea on how to do this? I have the strong feeling, that this should be possible with the help of ROW_NUMBER somehow, but I was not able to figure it out yet...

(In this example data some weird group by calendarweek or something similar might be possible, but this is not my intention)


Solution

  • It is a Gaps and Islands problem. One solution is this:

    WITH cte1 AS (
        SELECT *, CASE WHEN LAG([state]) OVER (PARTITION BY [key] ORDER BY [from]) = [state] THEN 0 ELSE 1 END AS chg
        FROM t
    ), cte2 AS (
        SELECT *, SUM(chg) OVER (PARTITION BY [key] ORDER BY [from]) AS grp
        FROM cte1
    )
    SELECT [key], grp, MIN([state]), MIN([from]), MAX([until])
    FROM cte2
    GROUP BY [key], grp
    ORDER BY [key], grp