sql postgresql random greatest-n-per-group

PostgreSQL. Select a column that correlates with value in the aggregate function

Here is the 'items' table, containing more than 10 rows:

+-----+-----------+-----------+----------+
| id  | item_name | category  | quantity |
+=====+===========+===========+==========+
| 3   | item33    | category1 | 5        |
+-----+-----------+-----------+----------+
| 2   | item52    | category5 | 1        |
+-----+-----------+-----------+----------+
| 1   | item46    | category1 | 3        |
+-----+-----------+-----------+----------+
| 4   | item11    | category3 | 2        |
+-----+-----------+-----------+----------+
| ... | ...       | ...       | ...      |
+-----+-----------+-----------+----------+

Values in the 'items' column are unique, the ones in the 'category' columnt - aren't unique.

The task is:

Remove duplicates of categories: if a category contains more than 1 item, take the row with minimal 'id'.
Order results by the 'quantity' (ASC).
Take 10 rows: top 5 and random 5 from the rest result data output.

So, the ordering table (after #2 sub-task) should look like that:

+-----+-----------+-----------+----------+
| id  | item_name | category  | quantity |
+=====+===========+===========+==========+
| 2   | item52    | category5 | 1        |
+-----+-----------+-----------+----------+
| 4   | item11    | category3 | 2        |
+-----+-----------+-----------+----------+
| 1   | item46    | category1 | 3        |
+-----+-----------+-----------+----------+
| ... | ...       | ...       | ...      |
+-----+-----------+-----------+----------+

I know how to exclude duplicates for categories:

SELECT min(id) as id, category
FROM items
GROUP BY category

But I don't know how to order it by the quantity. If I try to add 'quantity' to the 'select' line and then make 'ORDER BY quantity', I get the error: "column "quantity" must appear in the GROUP BY clause or be used in an aggregate function".

If there is a way to add this 'quantity' column to the data output (the value in this column should correlate with the resulting 'id' value (i.e. "min(id)"))? And then do ordering and picking rows...

Solution

Consider joining back your aggregate query to the unit level data for all columns including quantity:

SELECT i.id, i.item_name, i.category, i.quantity
FROM items i
INNER JOIN 
  (SELECT category, min(id) AS min_id
   FROM items
   GROUP BY category) agg
 ON i.id = agg.min_id
 AND i.category = agg.category
ORDER BY i.quantity

For top 5 and random 5 split, integrate a union with CTE to hold the resultset:

WITH sub AS (
  SELECT i.id, i.item_name, i.category, i.quantity
  FROM items i
  INNER JOIN 
    (SELECT category, min(id) AS min_id
     FROM items
     GROUP BY category) agg
   ON i.id = agg.min_id
   AND i.category = agg.category
)

-- TOP 5 ROWS
SELECT id, item_name, category, quantity
FROM sub
ORDER BY i.quantity
LIMIT 5

UNION

-- RANDOM ROWS OF NON-TOP 5
SELECT id, item_name, category, quantity
FROM 
  (SELECT id, item_name, category, quantity
   FROM sub
   ORDER BY i.quantity
   OFFSET 5) below5
ORDER BY random()
LIMIT 5