in my project I got 2 tables (postgresql) - partners and campaigns. Each partner may have many campaigns, but each campaign may have only one partner. The data looks like the following:
partners:
id | name
=========
1 Partner_1
2 Partner_2
campaigns:
id | name | partner_id | started_at
=============================================================
1 camp_1 1 2022-07-06 00:00:00.000000 +00:00
2 camp_2 1 2022-07-07 00:00:00.000000 +00:00
3 camp_3 2 2022-08-08 00:00:00.000000 +00:00
4 camp_4 2 2022-08-09 00:00:00.000000 +00:00
The task is for each partner select only 1 campaign with the most recent started_at date. So the final result would look like the following:
partner_name | campaign_started_at
==================================
Partner_1 2022-07-07 00:00:00.000000 +00:00
Partner_2 2022-08-09 00:00:00.000000 +00:00
For these purposes I guess I need to use a window function with select subquery and Group By statements and I tried the following:
SELECT public.partners.name,
public.campaigns.started_at as camp_started_at,
rank() OVER (PARTITION BY public.partners.name ORDER BY public.campaigns.started_at DESC)
FROM public.partners
JOIN campaigns ON campaigns.partner_id = partners.id
GROUP BY partners.name, public.campaigns.started_at;
but it did not produce the desired result. Any ideas how to fix it would be welcome.
You can simply use MAX()
and GROUP BY partners.name
to get your desired output. The window function RANK() OVER()
is expensive and overkill here.
SELECT public.partners.name
, MAX(public.campaigns.started_at) as camp_started_at
FROM public.partners
JOIN campaigns ON campaigns.partner_id = partners.id
GROUP BY partners.name
;