To simplify a complex query I am working on, I feel like solving this is key.
I have the following table
id | city | Item |
---|---|---|
1 | chicago | 1 |
2 | chicago | 2 |
3 | chicago | 1 |
4 | cedar | 2 |
5 | cedar | 1 |
6 | cedar | 2 |
7 | detroit | 1 |
I am trying to find the ratio of number of rows grouped by city and item to the number of rows grouped by just the items for each and every unique city-item pair.
So I would like something like this
City | Item | groupCityItemCount | groupItemCount | Ratio |
---|---|---|---|---|
chicago | 1 | 2 | 4 | 2/4 |
chicago | 2 | 1 | 3 | 1/3 |
cedar | 1 | 1 | 4 | 1/4 |
cedar | 2 | 2 | 3 | 2/3 |
detroit | 1 | 1 | 4 | 1/4 |
This is my current solution but its too slow.
Select city, item, (count(*) / (select count(*) from records t2 where t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item
Also replaced where with groupBy and having but that is also slow.
Select city, item, (count(*) / (select count(*) from records t2 group by item having t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item
(Note: I have removed column3 and column4 from the solution for smaller code)
(Edit: Typo as pointed out by xQbert and MatBailie)
Is it slow because it's evaluating each row separately with the subquery in the select statement? It may be operating as a correlated subquery.
If that's the case it might be faster if you get the values out of a join and go from there -
Select city, t1.item, (COUNT(t1.item) / MAX(t2.it_count)) AS pen_ratio
from records t1
JOIN (SELECT item, count(item) AS it_count
FROM records
group by item) t2
ON t2.item = t1.item
GROUP BY city, t1.item
Updated some errors and included the fiddle based off the starting point from xQbert. I had to CAST as float in the fiddle, but you may not need to CAST and use the above query in yours depending on datatypes.
I believe this follows the intent of your original query.
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=d77a715175159304b9192a16ad903347