Search code examples
sqlwhere-clausegreatest-n-per-grouparcgis

Top-n in primitive database


I use a GIS database called a File Geodatabase that has minimal SQL support.

Does not support:

  • Window Functions
  • TOP, LIMIT, FETCH FIRST or ROWNUM
  • Correlated subqueries

Does support:


Data:

I have a CITIES table.

If SQL clients were supported, the INSERT statements would look like this:

  create table cities 
   (objectid int, 
    population_centre varchar2(255), 
    population_2021 number(38,1), 
    other_columns varchar2(255)
   ) 
insert into cities (objectid,population_centre,population_2021,other_columns) values (11,'Calgary',1305550,'a');
insert into cities (objectid,population_centre,population_2021,other_columns) values (23,'Edmonton',1151635,'b');
insert into cities (objectid,population_centre,population_2021,other_columns) values (31,'Halifax',348634,'c');
insert into cities (objectid,population_centre,population_2021,other_columns) values (32,'Hamilton',729560,'d');
insert into cities (objectid,population_centre,population_2021,other_columns) values (37,'Kelowna',181380,'e');
insert into cities (objectid,population_centre,population_2021,other_columns) values (40,'Kitchener',522888,'f');
insert into cities (objectid,population_centre,population_2021,other_columns) values (45,'London',423369,'g');
insert into cities (objectid,population_centre,population_2021,other_columns) values (51,'Montreal',3675219,'h');
insert into cities (objectid,population_centre,population_2021,other_columns) values (58,'Oshawa',335949,'i');
insert into cities (objectid,population_centre,population_2021,other_columns) values (59,'Ottawa–Gatineau',1068821,'j');
insert into cities (objectid,population_centre,population_2021,other_columns) values (65,'Quebec City',733156,'k');
insert into cities (objectid,population_centre,population_2021,other_columns) values (67,'Regina',224996,'l');
insert into cities (objectid,population_centre,population_2021,other_columns) values (76,'Saskatoon',264637,'m');
insert into cities (objectid,population_centre,population_2021,other_columns) values (82,'St. Catharines – Niagara Falls',242460,'n');
insert into cities (objectid,population_centre,population_2021,other_columns) values (83,'St. John''s',185565,'o');
insert into cities (objectid,population_centre,population_2021,other_columns) values (90,'Toronto',5647656,'p');
insert into cities (objectid,population_centre,population_2021,other_columns) values (92,'Vancouver',2426160,'q');
insert into cities (objectid,population_centre,population_2021,other_columns) values (94,'Victoria',363222,'r');
insert into cities (objectid,population_centre,population_2021,other_columns) values (98,'Windsor',306519,'s');
insert into cities (objectid,population_centre,population_2021,other_columns) values (99,'Winnipeg',758515,'t');

select * from cities

  OBJECTID POPULATION_CENTRE POPULATION_2021 OTHER_COLUMNS
---------- --------------------------------- -------------
        11 Calgary                 1305550   a              
        23 Edmonton                1151635   b              
        31 Halifax                  348634   c              
        32 Hamilton                 729560   d              
        37 Kelowna                  181380   e              
        40 Kitchener                522888   f              
        45 London                   423369   g              
        51 Montreal                3675219   h              
        58 Oshawa                   335949   i              
        59 Ottawa–Gatineau         1068821   j              
        65 Quebec City              733156   k              
        67 Regina                   224996   l              
        76 Saskatoon                264637   m              
        82 St. Catharines           242460   n              
        83 St. John's               185565   o              
        90 Toronto                 5647656   p              
        92 Vancouver               2426160   q              
        94 Victoria                 363222   r              
        98 Windsor                  306519   s              
        99 Winnipeg                 758515   t              

Fake db<>fiddle


Question:

In the GIS software, I can write SQL expressions in a WHERE clause window to select rows.

enter image description here

Is there a way to select the cities that have the 4 highest populations, despite the limited SQL support?

I suspect the answer is no. But I can't help but wonder how it was done back in the early days when databases had limited SQL functionality. For example, when SQLite was first released, if it had limited SQL support, how were top-n queries performed?


Solution

  • This is a long shot, but works in the db<>fiddle tool. It's kind of brute force elimination and probably won't work right if there are rows with exact same value in population_2021 field. It relies on scalar subquery support, but I am not sure how much nesting the query engine allows, so give it a try:

    select *
    from cities
    where population_2021 = (select max(population_2021) from cities)
    or population_2021 = (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities))
    or population_2021 = (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities)))
    or population_2021 = (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities where population_2021 < (select max(population_2021) from cities))))
    order by population_2021 desc