Search code examples
postgresqlgreatest-n-per-group

Get last row from group, limit number of results in PostgreSQL


I have a table with records representing a log, I omit rest of the columns in this example.

The id-column is autoincrement, item_id represents an item in app.

I need to get the latest item_id, for example two or three

CREATE TABLE "log" (
    "id" INT,
    "item_id" INT
);

-- TRUNCATE TABLE "log";
INSERT INTO "log" ("id", "item_id") VALUES
    (1, 1),
    (2, 2),
    (3, 1),
    (4, 1),
    (5, 3),
    (6, 3);

Basic query will list all results, latest at the top:

SELECT *
FROM "log"
ORDER BY "id" DESC
id  item_id
6   3
5   3
4   1
3   1
2   2
1   1

I would like to have just two (LIMIT 2) last item_ids with their id. Last means - inserted last (ORDER BY id).

id  item_id
6   3
4   1

Last three would be

id  item_id
6   3
4   1
2   2

Once an item_id is returned, it is not returned again. So LIMIT 4 would return only three rows because there are only three unique item_id.

I am probably missing something. I already tried various combinations of DISTINCT OF, GROUP BY, LIMIT etc.


UPDATE #1: After I tested query by S-man (below), I found out that it works for the data I provided howerer it does not work in general, for another set of data (sequence of item_id A, B and A again.). Here is another data set:

TRUNCATE TABLE "log";
INSERT INTO "log" ("id", "item_id") VALUES
    (1, 1),
    (2, 2),
    (3, 3),
    (4, 3),
    (5, 1),
    (6, 3);

Data in DB, ordered by id desc:

id  item_id
6   3
5   1
4   3
3   3
2   2
1   1

Expected result for last three item_id

6   3
5   1
2   2

Solution

  • Well, after three changes, now we come back to the very first idea:

    Just take DISTINCT ON:

    demo:db<>fiddle

    SELECT
        *
    FROM (
        SELECT DISTINCT ON (item_id)   -- 1
            *
        FROM log
        ORDER BY item_id, id DESC
    ) s
    ORDER BY id DESC                   -- 2
    LIMIT 2
    
    1. Returns exact one record of an ordered group. You group is the item_id, the order is id DESC, so you get the highest id for each item_id
    2. Reorder by id DESC (instead of the previously ordered item_id) and limit your query output.