MySQL: GROUP BY where the delta between consecutive records is at most X


I need help writing a MySQL query.

I have a table full of logs where one of the columns is a Unix timestamp. I want to group (GROUP BY) those records so that events that occurred within a short time of each other (e.g. 5 seconds between consecutive events) end up in one group.

For example:

Table:

timestamp
----------
1429016966
1429016964
1429016963
1429016960
1429016958
1429016957
1429016950
1429016949
1429016943
1429016941
1429016940
1429016938

Should be grouped like this:

GROUP_CONCAT(timestamp)                                            | COUNT(*)
-----------------------------------------------------------------------------
1429016966,1429016964,1429016963,1429016960,1429016958,1429016957  |    6
1429016950,1429016949                                              |    2
1429016943,1429016941,1429016940,1429016938                        |    4

Of course, I could process the result array afterwards in PHP, but I think MySQL would do it faster.


Solution

  • I started by using a variable to number the rows, where position 1 is the row with the highest timeCol value and the last position is the row with the lowest, like this:

    SET @a := 0;
    
    SELECT timeCol, @a := @a + 1 AS position
    FROM myTable
    ORDER BY timeCol DESC;
    

    For simplicity, we will call the result of that query positionsTable so that the rest of the query is more readable. With that in place, I used a time_group variable and compared each row with the previous row (the next newer timestamp). If the previous row is at most 5 seconds later, the row keeps the same time_group; otherwise the counter is incremented and a new group starts. It sounds ugly, and looks kind of ugly, but it's like this:

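    -- Initialise the counter first: an unset user variable is NULL, and NULL + 1 stays NULL.
    SET @time_group := 0;

    SELECT m.timeCol, m.position,
      CASE WHEN (SELECT p.timeCol FROM positionsTable p WHERE p.position = m.position - 1) <= m.timeCol + 5
        THEN @time_group
        ELSE @time_group := @time_group + 1 END AS timeGroup
    FROM positionsTable m;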
    

    And then ultimately, using that as a subquery, you can group them:

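    -- Reset the counter so the group numbers start from 1 for this run.
    SET @time_group := 0;

    SELECT GROUP_CONCAT(timeCol), COUNT(*)
    FROM (
      -- Inner query: assign a timeGroup number to every row.
      SELECT m.timeCol, m.position,
        CASE WHEN (SELECT p.timeCol FROM positionsTable p WHERE p.position = m.position - 1) <= m.timeCol + 5
          THEN @time_group
          ELSE @time_group := @time_group + 1 END AS timeGroup
      FROM positionsTable m) tmp
    GROUP BY timeGroup;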
    

    Here is an SQL Fiddle example.
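
To sanity-check what the query should return, here is a minimal Python sketch of the same grouping rule applied to the sample data from the question. It is not part of the MySQL solution; the function name `group_by_gap` and the parameter `max_gap` are made up for illustration, and it assumes a gap of exactly 5 seconds still counts as the same group, as in the `<= m.timeCol + 5` condition.

```python
def group_by_gap(timestamps, max_gap=5):
    """Split timestamps into runs whose neighbours are at most max_gap apart."""
    groups = []
    # Walk from newest to oldest, mirroring ORDER BY timeCol DESC.
    for ts in sorted(timestamps, reverse=True):
        # Stay in the current group when the previous (newer) timestamp
        # is no more than max_gap seconds later than this one.
        if groups and groups[-1][-1] - ts <= max_gap:
            groups[-1].append(ts)
        else:
            # Gap too large (or very first row): start a new group.
            groups.append([ts])
    return groups


# Sample data from the question.
sample = [
    1429016966, 1429016964, 1429016963, 1429016960, 1429016958, 1429016957,
    1429016950, 1429016949,
    1429016943, 1429016941, 1429016940, 1429016938,
]

for group in group_by_gap(sample):
    # One line per group: the concatenated timestamps and the row count.
    print(",".join(map(str, group)), "|", len(group))
```

On the sample data this yields three groups of 6, 2 and 4 timestamps, which is the result the grouped query is expected to produce.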