I need to LEFT JOIN a table with only the lowest values in the right table that are greater than other values in the left table.
I can do this with ROW_NUMBER and Subqueries but the issue is that I need to do the same with about 5 Actions from 5 different tables and I dont want a 10-page long query that is difficult for my coworkers to audit.
I have Table.Users that measures when a user does an action. I want to left join it with Table.Action1 to gather the first incident of action1 that occurs after the action in Table.Users.
Users can appear multiple times in both tables.
Is there any way to gather only one row each from multiple LEFT JOINs without multiple subqueries?
I cant use aggregations as there is a lot of info I need from the action tables that is connected to the specific actions.
My data would look like: Table.Users:
| User_id | Date |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
Table.Action1
| User_id | Date |Action.id|
| 1 | 1 |1|
| 1 | 2 |2|
| 1 | 5 |3|
| 1 | 6 |4|
Ideal Result: Table.Users:
| User_id | Date |Action1.Date| Action1.id
| 1 | 1 |2 |2|
| 1 | 2 |5 |3|
| 1 | 3 |5 |3|
| 1 | 4 |5 |3|
| 1 | 5 |6 |4|
| 1 | 6 | | |
I have solved this before with multiple subqueries and ROW_NUMBER but I have many more tables that need joining now and dont want a very long query that is difficult for a co-worker to understand.
Assuming no duplicate (user_id, dt)
in the actions
table, as shown in your sample data, one option uses a left join
to bring the matching action (if any), and then window function last_value()
with option ignore nulls
to retrieve the relevant action date and id. The trick is to carefully adjust the window clause:
select u.*,
last_value(a.dt ignore nulls) over w last_action_dt,
last_value(a.id ignore nulls) over w last_action_id
from users u
left join actions a on a.user_id = u.user_id and a.dt = u.dt
window w as (
partition by u.user_id
order by u.dt desc
rows between unbounded preceding and 1 preceding
)
order by u.user_id, u.dt
Side note: I renamed column date
to dt
in the query, so it does not clash with the corresponding SQL keyword.