SQL get range of dates in between given dates grouped by another column in a table


In this table -

----------------------------------------------
ID  | user   | type   | timestamp
----------------------------------------------
1   | 1      | 1      | 2019-02-08 15:00:00
2   | 1      | 3      | 2019-02-15 15:00:00
3   | 1      | 2      | 2019-03-06 15:00:00
4   | 2      | 3      | 2019-02-01 15:00:00
5   | 2      | 1      | 2019-02-06 15:00:00
6   | 3      | 1      | 2019-01-10 15:00:00
7   | 3      | 4      | 2019-02-08 15:00:00
8   | 3      | 3      | 2019-02-24 15:00:00
9   | 3      | 2      | 2019-03-04 15:00:00
10  | 3      | 3      | 2019-03-05 15:00:00
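
To experiment with this data, the table can be recreated with a script along these lines. This is only a sketch: the table name `t` is borrowed from the queries in the solution, and the column types are assumptions, since the question does not show the schema.

```sql
-- Assumed schema: the table name `t` and all column types are guesses,
-- chosen only so the sample rows above can be loaded.
CREATE TABLE t (
    `ID`        INT      NOT NULL PRIMARY KEY,
    `user`      INT      NOT NULL,
    `type`      INT      NOT NULL,
    `timestamp` DATETIME NOT NULL
);

-- The ten sample rows, copied from the table in the question.
INSERT INTO t (`ID`, `user`, `type`, `timestamp`) VALUES
    (1,  1, 1, '2019-02-08 15:00:00'),
    (2,  1, 3, '2019-02-15 15:00:00'),
    (3,  1, 2, '2019-03-06 15:00:00'),
    (4,  2, 3, '2019-02-01 15:00:00'),
    (5,  2, 1, '2019-02-06 15:00:00'),
    (6,  3, 1, '2019-01-10 15:00:00'),
    (7,  3, 4, '2019-02-08 15:00:00'),
    (8,  3, 3, '2019-02-24 15:00:00'),
    (9,  3, 2, '2019-03-04 15:00:00'),
    (10, 3, 3, '2019-03-05 15:00:00');
```

The column names are quoted with backticks so the script does not depend on how the server treats words such as `user` and `timestamp`.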

I need to find the number of days each user has spent in each type within a given date range.

E.g., for the range 2019-02-01 to 2019-03-04, the output should be:

--------------------------------
user   | type   | No. of days
--------------------------------
1      | 1      | 7
1      | 3      | 17
2      | 3      | 6
3      | 1      | 29
2      | 4      | 16
2      | 3      | 8

A user can switch between types on any day, and I need to capture all of those switches along with the number of days the user spent in each type. I currently solve this by fetching all the rows and filtering them manually in JS. Is there any way to do this with a SQL query? I use MySQL 5.7.23.

EDIT:

The above output is incorrect, but I really appreciate everyone overlooking that and helping me with the right query. Here is the correct output for this question:

--------------------------------
user | type | No. of days
--------------------------------
   1 |    1 |          7
   1 |    3 |         19
   2 |    3 |          5
   3 |    1 |         29
   3 |    2 |          1
   3 |    3 |          8
   3 |    4 |         16

Solution

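  • Use lead() (a window function, so MySQL 8.0+) to get the next timestamp for each user, then datediff() and sum(), with least() and greatest() clipping each interval to the requested range:

    select user, type,
           -- clip each interval to the requested range, then count the days
           sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as days
    from (select t.*,
                 -- next row's timestamp for the same user; range end if there is no next row
                 lead(timestamp, 1, '2019-03-04') over (partition by user order by timestamp) as next_ts
          from t
         ) t
    where next_ts >= '2019-02-01' and
          timestamp <= '2019-03-04'
    group by user, type;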
    

    EDIT:

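    In older versions, such as the 5.7.23 from the question, where lead() is not available, a correlated subquery can compute next_ts instead:

    select user, type,
           -- clip each interval to the requested range, then count the days
           sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as days
    from (select t.*,
                 -- earliest later timestamp for the same user; range end if there is none
                 (select coalesce(min(t2.timestamp), '2019-03-04')
                   from t t2
                   where t2.user = t.user and t2.timestamp > t.timestamp
                 ) as next_ts
          from t
         ) t
    where next_ts >= '2019-02-01' and
          timestamp <= '2019-03-04'
    group by user, type;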