Search code examples
sqlpostgresqlsubqueryaggregate-functions

Aggregate daily sum over list of durations


Current situation

I have a table that tracks the usage of a certain resource. It looks like this:

started   | timestamp with time zone | not null
last_ping | timestamp with time zone |
stopped   | timestamp with time zone |

The stopped and last_ping fields may be empty. If they're both filled, stopped is relevant.

The start and stop dates can span only a few seconds or multiple dates.

Goal

I want to get a list of daily usage over the last 14 days.

Current status

I'm aware that I can get a list of the last 14 days:

SELECT day
FROM generate_series(CURRENT_DATE, CURRENT_DATE - 14, '-1 day'::interval) day;

I can get the total duration for each usage entry:

SELECT COALESCE(stopped, last_ping, started) - started AS duration
FROM api_sessionusage;

I can also combine the two queries and add a limit that only takes into account duration until midnight:

SELECT
  day,
  (
    SELECT SUM(
      LEAST(COALESCE(stopped, last_ping, started), day + interval '1 day') - started
    )
    FROM api_sessionusage
    WHERE started >= day AND started < day + interval '1 day'
  ) AS aggregated_duration
FROM generate_series(CURRENT_DATE, CURRENT_DATE -14, '-1 day'::interval) day;

The problem here is that usage sessions that end after midnight are only counted for the starting day, but the duration after midnight is not taken into account.

How can I rewrite the query so that I get an aggregated usage for each day during the last 14 days?


Solution

  • Use overlap and intersect operators of timestamp range type:

    select 
        day, 
        sum(upper(daily_range) - lower(daily_range))
    from (
        select 
            day, 
            session_range * tstzrange(day, day::date + 1) daily_range
        from generate_series(current_date, current_date -14, '-1 day'::interval) day
        left join (
            select tstzrange(started, coalesce(stopped, last_ping, started)) session_range
            from api_sessionusage
            ) s
        on session_range && tstzrange(day, day::date + 1)
    ) s
    group by 1
    order by 1;
    

    Note.

    coalesce(stopped, last_ping, started) - started as duration
    

    yields zero if both stopped and last_ping are null. Maybe it should be

    coalesce(stopped, last_ping, current_date) --?