Search code examples
mysqlsqldate-rangegaps-and-islandsmysql-5.6

MYSQL SUM durations within consecutive hours


Using an older version of MySQL where the WITH clause is not valid.

Starting with a table:

+--------+---------------------+---------------------+
| person | start_time          | end_time            |
+--------+---------------------+---------------------+
| Alice  | 2020-02-27 20:00:00 | 2020-02-27 20:59:59 |
| Alice  | 2020-02-27 23:45:00 | 2020-02-27 23:59:59 |
| Alice  | 2020-02-28 00:00:00 | 2020-02-28 00:59:59 |
| Alice  | 2020-02-28 01:00:00 | 2020-02-28 01:59:59 |
| Bob    | 2020-02-27 23:45:00 | 2020-02-27 23:59:59 |
| Cindy  | 2020-02-28 02:00:00 | 2020-02-28 02:59:59 |
| Cindy  | 2020-02-28 03:00:00 | 2020-02-28 03:36:59 |
+--------+---------------------+---------------------+

I'd like a query to sum all durations-per-person that fall within an hour of each other.

+--------+---------------------+---------------------+----------+
| person | start_time          | end_time            | duration |
+--------+---------------------+---------------------+----------+
| Alice  | 2020-02-27 20:00:00 | 2020-02-27 20:59:59 |     3599 |
| Alice  | 2020-02-27 23:45:00 | 2020-02-28 01:59:59 |     8064 |
| Bob    | 2020-02-27 23:45:00 | 2020-02-27 23:59:59 |      899 |
| Cindy  | 2020-02-28 02:00:00 | 2020-02-28 03:36:59 |     5806 |
+--------+---------------------+---------------------+----------+

Solution

  • E.g. - although, as written, this solution is exclusively for versions of MySQL prior to 8.0...

    DROP TABLE IF EXISTS my_table;
    
    CREATE TABLE my_table
    (person VARCHAR(12) NOT NULL
    ,start_time DATETIME NOT NULL
    ,end_time DATETIME NOT NULL
    ,PRIMARY KEY(person,start_time)
    );
    
    INSERT INTO my_table VALUES
    ('Alice','2020-02-27 20:00:00','2020-02-27 20:59:59'),
    ('Alice','2020-02-27 23:45:00','2020-02-27 23:59:59'),
    ('Alice','2020-02-28 00:00:00','2020-02-28 00:59:59'),
    ('Alice','2020-02-28 01:00:00','2020-02-28 01:59:59'),
    ('Bob','2020-02-27 23:45:00','2020-02-27 23:59:59'),
    ('Cindy','2020-02-28 02:00:00','2020-02-28 02:59:59'),
    ('Cindy','2020-02-28 03:00:00','2020-02-28 03:36:59');
    
    SELECT person
         , MIN(start_time) start_time
         , MAX(end_time) end_time
         , SUM(TIME_TO_SEC(TIMEDIFF(end_time,start_time))) delta 
      FROM 
         ( SELECT x.*
                , CASE WHEN person = @prev_person 
                       THEN CASE WHEN start_time <= @prev_end_time + INTERVAL 1 HOUR 
                                 THEN @i:=@i 
                                 ELSE @i:=@i+1 END 
                       ELSE @i:=1 END i
                , @prev_person := person
                , @prev_end_time := end_time
             FROM my_table x
                , (SELECT @prev_person := null, @prev_end_time := null, @i:=0) vars 
            ORDER 
               BY person
                , start_time
         ) a
     GROUP  
        BY person,i;
    +--------+---------------------+---------------------+-------+
    | person | start_time          | end_time            | delta |
    +--------+---------------------+---------------------+-------+
    | Alice  | 2020-02-27 20:00:00 | 2020-02-27 20:59:59 |  3599 |
    | Alice  | 2020-02-27 23:45:00 | 2020-02-28 01:59:59 |  8097 |
    | Bob    | 2020-02-27 23:45:00 | 2020-02-27 23:59:59 |   899 |
    | Cindy  | 2020-02-28 02:00:00 | 2020-02-28 03:36:59 |  5818 |
    +--------+---------------------+---------------------+-------+
    

    FWIW, I think rewriting the query this way renders it 'version agnostic', i.e. impervious to the fair accusation that the order of evaluation of elements is not guaranteed - but I might be wrong. Regardless, in MySQL 8.0+ the below can be rewritten with the extended functionality afforded by that version.

    SELECT person
         , MIN(start_time) start_time
         , MAX(end_time) end_time
         , SUM(TIME_TO_SEC(TIMEDIFF(end_time,start_time))) delta 
      FROM 
      ( SELECT * FROM
         ( SELECT x.*
                , CASE WHEN person = @prev_person 
                       THEN CASE WHEN start_time <= @prev_end_time + INTERVAL 1 HOUR 
                                 THEN @i:=@i 
                                 ELSE @i:=@i+1 END 
                       ELSE @i:=1 END i
                , @prev_person := person
                , @prev_end_time := end_time
             FROM my_table x
                , (SELECT @prev_person := null, @prev_end_time := null, @i:=0) vars 
         ) k
          ORDER 
                    BY person
                , start_time
         ) a
     GROUP  
        BY person,i;