Search code examples
mysqlsqlgroup-bywindow-functionsgaps-and-islands

How to count how many consecutive days a user is tagged as orange in MYSQL?


I would like to know how to count how many consecutive days up to today a user has been tagged as orange. I have the following


CREATE TABLE `survey_daily` (
  `id` int(11) NOT NULL,
  `user_id` varchar(30) NOT NULL,
  `color` varchar(10) NOT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


INSERT INTO `survey_daily` (`id`, `user_id`, `color`, `timestamp`) VALUES
(1, '3236', "ORANGE", '2020-05-12 02:40:59'),
(2, '3236', "WHITE", '2020-05-13 02:40:59'),
(3, '3236', "ORANGE", '2020-05-14 02:40:59'),
(4, '3236', "ORANGE", '2020-05-15 02:40:59'),
(5, '3237', "ORANGE", '2020-05-15 02:40:59'),
(6, '3237', "ORANGE", '2020-05-16 02:40:59'),
(7, '3236', "ORANGE", '2020-05-16 02:40:59');

Fiddle: http://sqlfiddle.com/#!9/40cb26/1.

Basically I have multiple users on a table and I would like to count how many consecutive days a user has been tagged as orange.

In my example, user id 3236 should have 3 consecutive days as orange while user 3237 should have 2 days tagged as orange until today. In case none of them has no record today it will return to 0.

enter image description here

Thank you


Solution

  • SELECT t1.user_id, MAX(1 + DATEDIFF(t2.`timestamp`, t1.`timestamp`)) max_delta
    FROM survey_daily t1
    JOIN survey_daily t2 ON t1.user_id = t2.user_id
    WHERE t1.color = 'ORANGE'
      AND t2.color = 'ORANGE'
      AND t1.`timestamp` <= t2.`timestamp`
      AND NOT EXISTS ( SELECT NULL
                       FROM survey_daily t3
                       WHERE t1.user_id = t3.user_id
                         AND t3.color != 'ORANGE'
                         AND t1.`timestamp` < t3.`timestamp`
                         AND t3.`timestamp` < t2.`timestamp` )
    GROUP BY t1.user_id;
    

    Logic. Take all record pairs for user where the color is ORANGE for both records and none record with another color exists between them. Calculate the distance in days in each pair. Get maximal gap value.

    fiddle (thanks to GMB for a fiddle which source data scripts are taken from).

    PS. If none record with ORANGE color exists for some user then this user will not be returned. If you need in such users too then get a copy of survey_daily table, LEFT JOIN my query to it as a subquery by user_id, then get users from the table and the amount of consecutive days from the subquery (wrap it with COALESCE function for to convern NULL value to zero).