I am attempting to build a query that reduces a GROUP BY group to a single row, including a value for a column based on the max value of another column. In this case, I want an item id, total qty ordered and most-used supplier.
I've successfully built a query that sums the qty ordered and groups by item and supplier, yielding:
| id | qty | supplier |
| 1 | 20 | S&S Activewear |
| 1 | 10 | J&J Textiles |
| 2 | 5 | AB Footwear |
| 2 | 10 | CD Shoes |
and the intended result would be total qty ordered (for all suppliers) and most used supplier, so:
| id | total_qty | most_used_supplier |
| 1 | 30 | S&S Activewear |
| 2 | 15 | CD Shoes |
Conceptually, I imagine doing a subquery, grouping the above results by id alone, then sum(qty) and somehow choose the supplier value by ranking the GROUP BY by qty.
I have read many related posts but I am failing to apply any of those methods successfully to this end, including use of ROW_NUMBER and PARTITION_BY.
I am doing this in Elixir with Ecto on a Postgres DB, but to keep it generalized so anyone can respond, I am just looking to understand how this would be done in SQL. Please let me know if I can provide more detail, thank you.
There are several approaches and it sounds like you've played with this one a bit even:
with data as (
select *,
row_number() over (partition by id order by qty desc) as rn
from T
)
select id, sum(qty) as total_qty,
(select d2.supplier from data d2
where d2.id = d.id and rn = 1) as most_used_supplier
from data d
group by id;