Search code examples
mysqlgroup-bycasecommon-table-expressionwindow-functions

Calculating consecutive occurences in MySQL


I have a quick question in relation to windowing in MySQL

SELECT 
Client,
User,
Date,
Flag,
lag(Date) over (partition by Client,User order by Date asc) as last_date,
lag(Flag) over (partition by Client,User order by Date  asc) as last_flag,
case when Flag = 1 and last_flag = 1 then 1 else 0 end as consecutive
FROM db.tbl

This query returns something like the below. I am trying to work out the number of consecutive times that the Flag column was 1 for each user most recently, if they had 11110000111 then we should take the final three occurences of 1 to determine that they had a consecutive flag of 3 times.

I need to extract the start and end date for the consecutive flag.

How would I go about doing this, can anyone help me :)

enter image description here

If we use the example of 11110000111 then we should extract only 111 and therefore the 3 most recent dates for that customer. So in the below, we would need to take 10.01.2023 as the first date and 24.01.2023 as the last date. The consecutive count should be 3

enter image description here

Output:

enter image description here


Solution

  • Use aggregation and string functions:

    WITH cte AS (
      SELECT Client, User,
             GROUP_CONCAT(CASE WHEN Flag THEN Date END ORDER BY Date) AS dates,
             CHAR_LENGTH(SUBSTRING_INDEX(GROUP_CONCAT(Flag ORDER BY Date SEPARATOR ''), '0', '-1')) AS consecutive
      FROM tablename
      GROUP BY Client, User
    )
    SELECT Client, User,
           NULLIF(SUBSTRING_INDEX(SUBSTRING_INDEX(dates, ',', -consecutive), ',', 1), '') AS first_date,
           CASE WHEN consecutive > 0 THEN SUBSTRING_INDEX(dates, ',', -1) END AS last_date,
           consecutive 
    FROM cte;
    

    Another solution with window functions and conditional aggregation:

    WITH 
      cte1 AS (SELECT *, SUM(NOT Flag) OVER (PARTITION BY Client, User ORDER BY Date) AS grp FROM tablename),
      cte2 AS (SELECT *, MAX(grp) OVER (PARTITION BY Client, User) AS max_grp FROM cte1)
    SELECT Client, User,
           MIN(CASE WHEN Flag THEN Date END) AS first_date,
           MAX(CASE WHEN Flag THEN Date END) AS last_date,
           SUM(Flag) AS consecutive
    FROM cte2 
    WHERE grp = max_grp
    GROUP BY Client, User;
    

    See the demo.