Search code examples
sqlcountgroupingwindow-functionsaggregation

Aggregate and count conditionally but return all lines


The table structure looks like this:

+----------+------------+---------------+
| id       | manager_id | restaurant_id |
+----------+------------+---------------+
| 1        | 1          | 1001          |
| 2        | 1          | 1002          | 
| 3        | 2          | 1003          |
| 4        | 2          | 1004          |
| 5        | 2          | 1005          |
| 6        | 3          | 1006          |
+----------+------------+---------------+

I want to retrieve all the restaurant_id aggregated per manager_id. Additionally, I also need to filter per manager's count(restaurant_id): returning only restaurants of managers that have more than one restaurant, and less than 3.

Edit: this is an oversimplified version of the real data, the actual use case must cover more than one to 5 (included).

So that in the end, the result would be

+---------------+------------+
| restaurant_id | manager_id | 
+---------------+------------+
| 1001          | 1          |
| 1002          | 1          |
+---------------+------------+

I tried something similar to:

SELECT 
  restaurant_id, 
  manager_id,
  COUNT(*) AS restaurant_count
FROM 
  Manager_Restaurant
GROUP BY 
  manager_id
HAVING 
  restaurant_count > 1 and 
  restaurant_count < 3;

But this return only one line per manager because of the grouping and I want all the restaurants.


Solution

  • Window (aka analytic) functions were designed with just such a case in mind.

    You can use a window function to assign to each row the counts for a particular grouping, then select the rows having the required counts. Here is an example:

    create table manager_restaurant (id int,manager_id int,restaurant_id int);
    insert into manager_restaurant 
      values (1,1,1001),(2,1,1002),(3,2,1003),(4,2,1004),(5,2,1005),(6,3,1006);
    
    
    select manager_id,restaurant_id from (
      select *,
        count(1) over (partition by manager_id) as n 
      from 
        manager_restaurant
      )x 
     where 
       n>1 and n<3 ;
    
    manager_id restaurant_id
    1 1001
    1 1002