Search code examples
sqlpostgresqlselecttimerelational-division

Select a row with multiple time ranges


I have this table:

CREATE TABLE logins(
    id SERIAL NOT NULL PRIMARY KEY,
    login_time TSRANGE NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users(id),
    CONSTRAINT overlapping_timeslots EXCLUDE USING GIST (
        user_id WITH =,
        timeslot WITH &&
    )
);

When a user logs in the login_time saved with tsrange(login_time,logout_time).
Now I try to search for a user who logs in at:

-- ('2013-12-31 16:40:05','2013-12-31 17:40:05')
-- ('2014-01-04 14:27:45','2014-01-04 17:30:56')
-- ('2014-01-05 14:59:55','2014-01-05 16:03:39')
-- ('2014-01-01 17:20:54','2014-01-01 22:50:57')
-- Not logged in at ('2013-12-31 18:40:05','2014-01-01 01:20:05')

I have this query but with no useful result

SELECT user_id FROM (


select * from logins 
    where user_id in(select user_id from timed_requests where timeslot && tsrange('2013-12-31 16:20:05','2013-12-31 17:40:05'))
    and user_id in(select user_id from timed_requests where timeslot && tsrange('2014-01-04 14:30:45','2014-01-04 17:20:56'))
    and user_id in(select user_id from timed_requests where timeslot && tsrange('2014-01-05 15:09:55','2014-01-05 16:00:39'))
    and user_id in(select user_id from timed_requests where timeslot && tsrange('2014-01-01 17:20:54','2014-01-01 22:50:57')
    and user_id not in(select user_id from timed_requests where timeslot && tsrange('2013-12-31 18:40:05','2014-01-01 01:20:05'))
    ) ss
GROUP BY user_id
order by user_id;

Does anyone know how I can write a query who searches for a user who logs in at 3-4 given timepoints.


Solution

  • This is a typical case of relational division. There are many ways to solve it. This should be among the fastest and simplest:

    SELECT DISTINCT user_id
    FROM   logins l1
    JOIN   logins l2 USING (user_id)
    JOIN   logins l3 USING (user_id)
    JOIN   logins l4 USING (user_id)
    LEFT   JOIN logins l5 ON t5.user_id = t1.user_id AND 
      NOT (l4.timeslot && tsrange('2013-12-31 18:40:05','2014-01-01 01:20:05'))
    WHERE  l1.timeslot && tsrange('2013-12-31 16:20:05','2013-12-31 17:40:05')
    AND    l2.timeslot && tsrange('2014-01-04 14:30:45','2014-01-04 17:20:56')
    AND    l3.timeslot && tsrange('2014-01-05 15:09:55','2014-01-05 16:00:39')
    AND    l4.timeslot && tsrange('2014-01-01 17:20:54','2014-01-01 22:50:57')
    AND    l5.user_id IS NULL
    ORDER  BY 1;
    

    You have an exclusion constraint in place, but the same could be logged in multiple times during a single test range, so we need GROUP BY or DISTINCT.

    We have assembled a whole arsenal of techniques in this related answer:
    How to filter SQL results in a has-many-through relation

    To avoid duplicates to begin with and at the same time retrieve a whole row from a users table (which is not in your question, but probably exists), this form might be faster:

    SELECT *
    FROM   users u
    WHERE  EXISTS (SELECT 1 FROM logins WHERE user_id = u.user_id
           AND timeslot && tsrange('2013-12-31 16:20:05','2013-12-31 17:40:05'))
    AND    EXISTS (SELECT 1 FROM logins WHERE user_id = u.user_id
           AND timeslot && tsrange('2014-01-04 14:30:45','2014-01-04 17:20:56'))
    AND    EXISTS (SELECT 1 FROM logins WHERE user_id = u.user_id
           AND timeslot && tsrange('2014-01-05 15:09:55','2014-01-05 16:00:39'))
    AND    EXISTS (SELECT 1 FROM logins WHERE user_id = u.user_id
           AND timeslot && tsrange('2014-01-01 17:20:54','2014-01-01 22:50:57'))
    AND    NOT EXISTS (SELECT 1 FROM logins WHERE user_id = u.user_id
           AND timeslot && tsrange('2013-12-31 18:40:05','2014-01-01 01:20:05'))
    ORDER  BY u.user_id;
    

    The exclusion constraint on logins is instrumental for these queries. It is implemented by a multicolumn GiST index that makes these look-ups very fast.