Say I have a table of people who have used my service on a day N and a table describing what theme the users have changed to. There's no table available telling me what theme they are currently using. What I would like to do is to see that on a daily basis.
So let's say the changes table looks like this.
| user_ID | date | theme |
|---------|----------|-------|
| user1 | 1.1.2021 | Dark |
| user1 | 4.1.2021 | Light |
| user2 | 2.1.2021 | Dark |
| user2 | 6.1.2021 | Light |
The activity table has just the user_ID and a date they accessed the service reported.
| user_ID | date |
|---------|----------|
| user1 | 1.1.2021 |
| user1 | 2.1.2021 |
| user1 | 3.1.2021 |
| user1 | 4.1.2021 |
| user1 | 5.1.2021 |
| user1 | 6.1.2021 |
| user2 | 2.1.2021 |
| user2 | 3.1.2021 |
| user2 | 4.1.2021 |
| user2 | 5.1.2021 |
| user2 | 6.1.2021 |
Now what I would like to do is join the first table to the second one so the theme they are using on an active date would be listed there.
| user_ID | date | theme |
|---------|----------|-------|
| user1 | 1.1.2021 | Dark |
| user1 | 2.1.2021 | Dark |
| user1 | 3.1.2021 | Dark |
| user1 | 4.1.2021 | Light |
| user1 | 5.1.2021 | Light |
| user1 | 6.1.2021 | Light |
| user2 | 2.1.2021 | Dark |
| user2 | 3.1.2021 | Dark |
| user2 | 4.1.2021 | Dark |
| user2 | 5.1.2021 | Dark |
| user2 | 6.1.2021 | Light |
How do I achieve this? Assume there can be an unlimited amount of themes.
One method is a correlated subquery, but I'm not sure if Presto supports this:
select a.*,
(select c.theme
from changes c
where c.user_id = a.user_id and
c.date <= a.date
order by c.date desc
limit 1
) as theme
from activity a;
A perhaps more efficient method is to use left join
but to calculate the "end" date for each change:
select a.*, c.theme
from activity a left join
(select c.*,
lead(date) over (partition by user_id order by date) as next_date
from changes c
) c
on a.user_id = c.user_id and
a.date >= c.date and
(a.date < c.next_date or c.next_date is null);