Search code examples
sqljoingroup-bypostgresql-13

Optimal SQL to perform multiple aggregate functions with different group by fields


To simplify a complex query I am working on, I feel like solving this is key.

I have the following table

id city Item
1 chicago 1
2 chicago 2
3 chicago 1
4 cedar 2
5 cedar 1
6 cedar 2
7 detroit 1

I am trying to find the ratio of number of rows grouped by city and item to the number of rows grouped by just the items for each and every unique city-item pair.

So I would like something like this

City Item groupCityItemCount groupItemCount Ratio
chicago 1 2 4 2/4
chicago 2 1 3 1/3
cedar 1 1 4 1/4
cedar 2 2 3 2/3
detroit 1 1 4 1/4

This is my current solution but its too slow.

Select city, item, (count(*) / (select count(*) from records t2 where t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item

Also replaced where with groupBy and having but that is also slow.

Select city, item, (count(*) / (select count(*) from records t2 group by item having t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item

(Note: I have removed column3 and column4 from the solution for smaller code)

(Edit: Typo as pointed out by xQbert and MatBailie)


Solution

  • Is it slow because it's evaluating each row separately with the subquery in the select statement? It may be operating as a correlated subquery.

    If that's the case it might be faster if you get the values out of a join and go from there -

    Select city, t1.item, (COUNT(t1.item) / MAX(t2.it_count)) AS pen_ratio
    from records t1
    JOIN (SELECT item, count(item) AS it_count
          FROM records
          group by item) t2
            ON t2.item = t1.item
    GROUP BY city, t1.item
    

    Updated some errors and included the fiddle based off the starting point from xQbert. I had to CAST as float in the fiddle, but you may not need to CAST and use the above query in yours depending on datatypes.

    I believe this follows the intent of your original query.

    https://dbfiddle.uk/?rdbms=postgres_13&fiddle=d77a715175159304b9192a16ad903347