Search code examples
sql-serversql-server-2012sequencepartitioningranking

How to find the consecutive number of rows with the same value, given multiple constraints in SQL?


I am using SQL Server 2012 and my data looks like this:

ActivityID  ActivityCode    Name_of_User    ActivityCode_PrevActivity   isLastActivityByUserSame
1020        B1              ABE             A2                           0
1021        A2              BOB             A1                           0
1022        A2              BOB             A2                           1
1023        B1              ABE             B1                           1
1024        B1              ABE             B1                           1
1025        B2              ABE             B1                           0
1026        B2              CARL            A3                           0
1027        A1              CARL            B2                           0
1028        B8              BOB             A2                           0
1029        A1              CARL            A1                           1
1030        B2              ABE             B2                           1
1031        B4              ABE             B2                           0
1032        B8              BOB             B8                           1
1033        A3              BOB             B8                           0
1034        B4              CARL            A1                           0

Just as a brief overview of what the fields represent:

  • ActivityID: PK of the table
  • ActivityCode: The type of activity of the user
  • Name_of_User: The name of the user
  • ActivityCode_PrevActivity: Activity code of the last activity by the user
  • isLastActivityByUserSame: Binary value to indicating whether the last activity code by the user is the same as the current activity code

Besides that, there are a few other columns that will likely not help with the task at hand.

I would like to generate a running count of the number of consecutive rows with the same value for each row, by Name_of_User. To give you an idea of what that would look like, here is the intended result (notice the new ConsecActivityCount column):

ActivityID  ActivityCode    Name_of_User    ActivityCode_PrevActivity   isLastActivityByUserSame    ConsecActivityCount
1020        B1              ABE             A2                          0                           0
1023        B1              ABE             B1                          1                           1
1024        B1              ABE             B1                          1                           2
1025        B2              ABE             B1                          0                           0
1030        B2              ABE             B2                          1                           1
1031        B4              ABE             B2                          0                           0
1021        A2              BOB             A1                          0                           0
1022        A2              BOB             A2                          1                           1
1028        B8              BOB             A2                          0                           0
1032        B8              BOB             B8                          1                           1
1033        A3              BOB             B8                          0                           0
1026        B2              CARL            A3                          0                           0
1027        A1              CARL            B2                          0                           0
1029        A1              CARL            A1                          1                           1
1034        B4              CARL            A1                          0                           0

I tried to use something like:

Select *, 
DENSE_RANK() OVER (PARTITION BY Name_of_User ORDER BY ActivityID) as ConsecReports
FROM ActivityTable

But, I am not having any luck.

What is the best way to compute such a column?

Thanks in advance!


Solution

  • I think I found a good solution. I first created IDs for sequences where they appeared and then, counted the number of consecutive positive 'isLastActivityByUserSame'.

    Here is the solution:

    SELECT at.*,
       (CASE WHEN at.isLastActivityByUserSame != 0
             THEN SUM(CASE WHEN at.isLastActivityByUserSame != 0 THEN at.isLastActivityByUserSame END) OVER (PARTITION BY Name_of_User, LastSequenceID ORDER BY ActivityID)
        ELSE 0 END) AS ConsecActivityCount
    FROM (select at.*,
             SUM(CASE WHEN isLastActivityByUserSame != 0 THEN 0 ELSE 1 END) OVER (ORDER BY ActivityID) AS LastSequenceID
      FROM ActivityTable at
     ) at ORDER BY Name_of_User, ActivityID;