sql aggregate amazon-redshift analytics

How to write an SQL query joining on datetime for each row


I have the following data: news (each article with a publication date and time) and views (one row per view, recorded at a particular timestamp):

news

id, published_timestamp
1, 2018-11-16 13:30:24
2, 2018-12-03 13:59:56
...

views

view_id, post_id, viewed_timestamp
a, 1, 2018-11-16 13:30:24
b, 1, 2018-11-16 13:30:25
c, 1, 2018-11-16 13:30:26
d, 2, 2018-12-03 13:59:56
e, 2, 2018-12-03 13:59:57
f, 2, 2018-12-05 13:59:57
...

Getting the lifetime views of each news article is simple:

SELECT 
post_id, count(view_id) 
FROM views 
GROUP BY post_id

How do I write an SQL query (I'm using AWS Redshift) that can join the views relative to the published datetime? For example, I want to get only views for each article within 1 day of its posting.


Solution

  • Join views to news with the 1-day window as part of the join condition, something like this:

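    -- Count only the views that fall within 1 day of each article's publication.
    -- The time filter lives in the ON clause (not WHERE) so that the left join
    -- still returns articles with zero matching views, with a count of 0.
    select n.id, n.published_timestamp,
           count(v.post_id) as views_within_1_day_of_post
    from news as n
    left join views as v on n.id = v.post_id
                        and v.viewed_timestamp between n.published_timestamp
                                                   and n.published_timestamp + interval '1 day'
    group by 1, 2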
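
  • If you also want the lifetime count next to the 1-day count, one possible variation is conditional aggregation. This is only a sketch, not tested against your tables: it assumes view_id is never NULL, uses Redshift's DATEADD function in place of the interval arithmetic, and uses a half-open window (>= published, < published + 1 day) so a view landing exactly 24 hours after publication is excluded, whereas BETWEEN would include it. The column aliases are just illustrative names.

```sql
-- Sketch: lifetime views and first-day views per article in one pass.
-- Assumes view_id is non-null; aliases are made up for illustration.
select n.id,
       n.published_timestamp,
       -- every joined view row counts toward the lifetime total
       count(v.view_id) as lifetime_views,
       -- CASE yields NULL outside the window, and COUNT ignores NULLs
       count(case
               when v.viewed_timestamp >= n.published_timestamp
                and v.viewed_timestamp <  dateadd(day, 1, n.published_timestamp)
               then v.view_id
             end) as views_within_1_day_of_post
from news as n
left join views as v on n.id = v.post_id
group by n.id, n.published_timestamp;
```

    Here the window test moves out of the join and into the CASE, because the join now has to keep every view for the lifetime total. On just the six view rows shown in the question, article 2 would get 3 lifetime views but only 2 within the first day, since view f arrives two days after publication.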