Search code examples
sqlsql-servert-sqlwindow-functions

Count blocks of data based on id and status code


I have a dataset that looks something like this

id status datetime
123456 0 07/02/2023 12:43
123456 4 07/02/2023 12:49
123456 5 07/02/2023 12:58
123456 5 07/02/2023 13:48
123456 7 07/02/2023 14:29
123456 0 07/02/2023 14:50
123456 4 07/02/2023 14:50
123456 5 07/02/2023 14:51
123456 9 07/02/2023 15:27
567890 0 07/02/2023 11:44
567890 4 07/02/2023 12:23
567890 5 07/02/2023 12:29
567890 5 07/02/2023 13:26
567890 5 07/02/2023 13:28
567890 5 07/02/2023 13:28
567890 5 07/02/2023 13:29
567890 9 07/02/2023 13:55

For each id in the dataset there are a number of statuses that need to be identified as 'blocks' of activity, where each block starts with a status code of 0 (and is sorted by datetime)

What I'd like to do is to add a column that identifies this block. So my data would look like this with that column added.

id status datetime block
123456 0 07/02/2023 12:43 1
123456 4 07/02/2023 12:49 1
123456 5 07/02/2023 12:58 1
123456 5 07/02/2023 13:48 1
123456 7 07/02/2023 14:29 1
123456 0 07/02/2023 14:50 2
123456 4 07/02/2023 14:50 2
123456 5 07/02/2023 14:51 2
123456 9 07/02/2023 15:27 2
567890 0 07/02/2023 11:44 1
567890 4 07/02/2023 12:23 1
567890 5 07/02/2023 12:29 1
567890 5 07/02/2023 13:26 1
567890 5 07/02/2023 13:28 1
567890 5 07/02/2023 13:28 1
567890 5 07/02/2023 13:29 1
567890 9 07/02/2023 13:55 1

I've used window functions before, but I can't get my head around how to do this.


Solution

  • You may get the desired result using CTE and the below query

    Sample Table:

    CREATE TABLE SampleData (
      id INT,
      status INT,
      datetime DATETIME
    );
    

    Sample Data:

    INSERT INTO SampleData (id, status, datetime)
    VALUES
      (123456, 0, '07/02/2023 12:43'),
      (123456, 4, '07/02/2023 12:49'),
      (123456, 5, '07/02/2023 12:58'),
      (123456, 5, '07/02/2023 13:48'),
      (123456, 7, '07/02/2023 14:29'),
      (123456, 0, '07/02/2023 14:50'),
      (123456, 4, '07/02/2023 14:50'),
      (123456, 5, '07/02/2023 14:51'),
      (123456, 9, '07/02/2023 15:27'),
      (567890, 0, '07/02/2023 11:44'),
      (567890, 4, '07/02/2023 12:23'),
      (567890, 5, '07/02/2023 12:29'),
      (567890, 5, '07/02/2023 13:26'),
      (567890, 5, '07/02/2023 13:28'),
      (567890, 5, '07/02/2023 13:28'),
      (567890, 5, '07/02/2023 13:29'),
      (567890, 9, '07/02/2023 13:55');
    

    Query:

      WITH CteSampleData AS (
      SELECT *,
             SUM(CASE WHEN status = 0 THEN 1 ELSE 0 END)
             OVER (PARTITION BY id ORDER BY datetime) AS block
      FROM SampleData
    )
    SELECT id, status, datetime, block
    FROM CteSampleData
    ORDER BY id, datetime;