Search code examples
sqlpostgresqlgreatest-n-per-group

Select top 3 records from each territory


I am struggling with a PostgreSQL query to select the top 3 records by each category. I have 3 tables (as given below) and data,

geography:

enter image description here

formulary:

enter image description here

formulary_controller:

enter image description here

I want to select the top 3 formularies order by lives desc from each territory under a region.

The query I came up with so far is,

select 
    t.name territory, f.name formulary, sum(f.lives) lives
from formulary f join geography t on f.territory_id = t.id
where t.parent_id = '1'
group by t.name, f.name
order by lives desc, territory;

If I apply limit 3 top above query will give me the top 3 from all territories, but I want to select the top 3 formularies from each territory under the keystone region.

Queries to create a table and load dummy data:

create table formulary_controller
(
    id SERIAL primary key,
    name varchar(300)
);

create table geography
(
    id SERIAL primary key,
    name varchar(250),
    parent_id integer references geography(id)
);

create table formulary
(
    id SERIAL primary key,
    name varchar(300),
    lives integer,
    controller_id integer references formulary_controller(id),
    territory_id integer references geography(id)
);


insert into formulary_controller (id, name)
values(1, 'cont1'), (2, 'cont2');


insert into geography (id, name, parent_id)
values
(1, 'keystone', null),
(2, 'pittsburgh', 1),
(3, 'Baltimore', 1);


insert into formulary
(name, lives, controller_id, territory_id)
values
('PA FRM 1', 200, 1, 2),
('PA FRM 2', 1400, 1, 2),
('PA FRM 3', 1300, 1, 2),
('PA FRM 4', 100, 1, 2),
('PA FRM 5', 2430, 1, 2),
('BA FRM 1', 100, 2, 3),
('BA FRM 2', 2300, 2, 3),
('BA FRM 3', 1200, 2, 3),
('BA FRM 4', 1650, 2, 3),
('BA FRM 5', 1200, 2, 3);

Solution

  • You can try the below - DEMO HERE

    select * from
    (
    select 
        t.name territory, f.name formulary, sum(f.lives) lives,
          row_number() over(partition by t.name order by sum(f.lives) desc) as rn
    from formulary f join geography t on f.territory_id = t.id
    where t.parent_id = '1'
    group by t.name,f.name
    )A where rn<=3