Search code examples
sqlpostgresqlgroup-bycountwindow-functions

PostgreSQL: count rows where condition on start date and end date fits in generated time series


I have data organised like this

CREATE TABLE sandbox.tab_1 (id serial, started timestamp, ended timestamp);
INSERT INTO sandbox.tab_1 (id, started, ended) VALUES
    (1, '2020-01-03'::timestamp, NULL),
    (2, '2020-01-05'::timestamp, '2020-01-06'),
    (3, '2020-01-07'::timestamp,  NULL),
    (4, '2020-01-08'::timestamp,  NULL);

enter image description here

I need to count the number of rows where started >= and ended < than a generated time series that goes from min(started) to max(started). This would give me for each day the stock of started and not ended ids at a given time. The result would be something like this:

enter image description here

Thank you for your help


Solution

  • You can LEFT JOIN the table to the series of timestamps on the start being less then or equal to the timestamp and the end being greater than the timestamp or being NULL. Then GROUP BY the timestamps and take the count().

    SELECT gs.ts,
           count(t1.started)
           FROM generate_series('2020-01-03'::timestamp, '2020-01-08'::timestamp, '1 day'::interval) gs (ts)
                LEFT JOIN tab_1 t1
                          ON t1.started <= gs.ts
                             AND (t1.ended IS NULL
                                   OR t1.ended > gs.ts)
           GROUP BY gs.ts
           ORDER BY gs.ts;
    

    db<>fiddle