Using a MySQL query to count records with time ranges at regular intervals


Given a MySQL table that tracks how long a visitor spent at a location, what would be a suitable query to count the total visitors at 5 minute intervals over the course of a day?

+-----------------+------------------+------+-----+---------+-------+
| Field           | Type             | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| end_timestamp   | int(10) unsigned | NO   | PRI | NULL    |       |
| start_timestamp | int(10) unsigned | NO   | PRI | NULL    |       |
| visitor_id      | int(10) unsigned | NO   | PRI | NULL    |       |
| location_id     | int(10) unsigned | NO   | PRI | NULL    |       |
+-----------------+------------------+------+-----+---------+-------+

E.g. results might look like this:

+---------------------+-------------------+
| Timestamp           | COUNT(visitor_id) |
+---------------------+-------------------+
| 2020-01-01 00:00:00 | 45                |
| 2020-01-01 00:05:00 | 49                |
| 2020-01-01 00:10:00 | 37                |
...

This is something I'm currently calculating post-query, but I'm looking to shift some of the work to the MySQL server by doing it as part of the database query.


Solution

  • If you are running MySQL 8.0, you can use a recursive query to generate the intervals, then bring in your table with a left join, and finally aggregate.

    The following query gives, for the current day, the number of visitors present at each 5-minute mark (you can change current_date to some other date as needed):

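    with recursive all_ts as (
        -- anchor row: midnight today, cast so that ts is a DATETIME rather than a DATE
        select cast(current_date as datetime) ts
        union all
        -- add 5 minutes per step; the last row generated is 23:55:00
        select ts + interval 5 minute
        from all_ts
        where ts < current_date + interval 1 day - interval 5 minute
    )
    select a.ts, count(t.visitor_id) no_visitors
    from all_ts a
    -- a visit counts at ts if it started at or before ts and had not yet ended
    left join mytable t
        on  t.start_timestamp <= a.ts
        and t.end_timestamp   >  a.ts
    group by a.ts
    order by a.ts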
    

    If you are storing your dates as unix timestamps (as the int(10) unsigned columns in the question suggest), you can change the left join as follows:

    left join mytable t
        on  t.start_timestamp <= unix_timestamp(a.ts)
        and t.end_timestamp   >  unix_timestamp(a.ts)
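
Putting the pieces together, the whole query for a table that stores unix timestamps might look like the sketch below. It makes a few assumptions: mytable is just the placeholder table name used in this answer, '2020-01-01' is an example date borrowed from the sample output in the question, and "count at 5-minute intervals" is read as "visits in progress at each 5-minute mark".

```sql
-- Sketch only: mytable and the date '2020-01-01' are placeholders.
with recursive all_ts as (
    -- anchor row: midnight of the chosen day, as a DATETIME
    select cast('2020-01-01' as datetime) as ts
    union all
    -- step forward 5 minutes at a time; recursion stops once 23:55:00 exists
    select ts + interval 5 minute
    from all_ts
    where ts < '2020-01-01 23:55:00'
)
select
    a.ts,
    count(t.visitor_id) as no_visitors  -- 0 when no visit matches the slot
from all_ts a
left join mytable t
    -- visits in progress at the instant a.ts
    on  t.start_timestamp <= unix_timestamp(a.ts)
    and t.end_timestamp   >  unix_timestamp(a.ts)
group by a.ts
order by a.ts;
```

The CTE yields 288 rows (00:00:00 through 23:55:00), which is well under MySQL's default cte_max_recursion_depth of 1000. Note that unix_timestamp() interprets a.ts in the session time zone, so the session time zone should match the one in which you want the day's boundaries to fall. To restrict the count to a single location, add a condition on t.location_id to the on clause rather than to a where clause, so that intervals with no visitors are still returned with a count of 0.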