Search code examples
sqlpostgresqlwindow-functions

LIMIT based on COUNT of DISTINCT `foreign_key` so far


I have a union of 2 tables 'sites' and 'trucks' (ordered by distance). The recordset looks like:

enter image description here

I need to fetch all rows until I reach to the specific (n) number of unique company_ids, starting from row 1.

So, if I a get something like the following: enter image description here

Then I can make a simple query such as:

SELECT * FROM union_recordset where distinct_company_id_count_so_far < (3 + 1);
-- where n = 3

and get the desired result of:

enter image description here


Solution

  • If your database supports count(distinct) as a window function:

    select ur.*,
           count(distinct company_id) over (order by distance) as cnt
    from union_recordset ur
    order by distance;
    

    If not, you can count the first occurrence:

    select ur.*,
           sum(case when seqnum = 1 then 1 else 0 end) over (order by distance) as cnt
    from (select ur.*,
                 row_number() over (partition by company_id order by distance) as seqnum
          from union_recordset ur
         ) ur
    order by distance;
    

    In Postgres, the sum() can be simplified to:

           sum( (seqnum = 1)::int ) over (order by distance) as cnt
    

    And then to get the numbers for, say, the first three companies, you want:

    select ur.*
    from (select ur.*,
                 sum( (seqnum = 1)::int ) over (order by distance) as cnt
          from (select ur.*,
                       row_number() over (partition by company_id order by distance) as seqnum
                from union_recordset ur
               ) ur
         ) ur
    where cnt <= 3
    order by distance;