Search code examples
postgresqltimestampsql-timestamp

Why does my timestamp query not return values exactly on the hour?


I have a dataset of documents with associated publication dates (regular Postgres TIMESTAMP. After noticing that some of these data points are for some reason invalid, I would like to remove all documents from a specific outlet that are published in the hour of 10-11 am, on two specific dates.

So far, I have come up with two different queries to do this, but they strangely return different results The first one returns all documents based on the simple DATETIME range query, and has 1603 results.

The second one queries only the days, and then all elements that have DATE_PART('HOUR', published) = 10, which should (in theory) return the exact same.

As the below query shows, though, there are two elements that are incidentally published exactly on the hour. Although they still have the same DATE_PART signature, they seemingly get ignored in the second query.

Can anyone tell me whether this is default behavior, or why this would return different answers?

postgres=# SELECT document_id, published, DATE_PART('HOUR', published) AS hour 
FROM documents 
WHERE (published >= '2016-08-18 10:00:00.000' AND published <= '2016-08-18 10:59:59.999')                                                         
OR (published >= '2016-08-28 10:00:00.000' AND published <= '2016-08-28 10:59:59.999') 
AND feedName = 'WP'
EXCEPT
SELECT document_id, published, DATE_PART('HOUR', published) AS hour 
FROM documents WHERE (to_char(published, 'YYYY-MM-DD') = '2016-08-18' 
OR to_char(published, 'YYYY-MM-DD') = '2016-08-28') 
AND feedName = 'WP' AND DATE_PART('HOUR', published) = 10;

 document_id |      published      | hour 
-------------+---------------------+------
       75676 | 2016-08-18 10:00:00 |   10
       76424 | 2016-08-18 10:00:00 |   10

Solution

  • The problem turned out to be a missing set of brackets around the first query, where the two date ranges should have been connected with the OR operator inside a separate set of brackets, like so:

    SELECT document_id, published, DATE_PART('HOUR', published) AS hour 
    FROM documents 
    WHERE ((published >= '2016-08-18 10:00:00.000' AND published <= '2016-08-18 10:59:59.999')                                                         
    OR (published >= '2016-08-28 10:00:00.000' AND published <= '2016-08-28 10:59:59.999')) 
    AND feedName = 'WP'