Search code examples
sqlsqlitegaps-and-islands

SQL Query to find the activity time of each user


How can we use this data to calculate the activity of each user in each period?

user_id       date            status 
 101        2018-01-1          1 
 101        2018-01-2          0 
 101        2018-01-3          1   
 101        2018-01-4          1 
 101        2018-01-5          1

Output:

 user_id  start_date     end_date  status  length 
   101    2018-01-1    2018-01-1    1      1
   101    2018-01-2    2018-01-2    0      1
   101    2018-01-3    2018-01-5    1      3

Solution

  • This is a gaps-and-islands problem. You can group them together using by subtracting a sequence:

    select user_id, status, min(date), max(date),
           julianday(max(date)) - julianday(min(date)) as length
    from (select t.*,
                 row_number() over (partition by user_id, status order by date) as seqnum
          from t
         ) t
    group by user_id, status, date(date, '-' || seqnum || ' day')
    order by user_id, length;
    

    Here is a db<>fiddle.