Let's say I have 2 tables (Postgres if it matters):
CREATE TABLE items
(
item_id bigint BIGSERIAL,
description character varying(2000) NOT NULL,
date TIMESTAMP default now(),
-- (many more columns)
CONSTRAINT "item_pk" PRIMARY KEY ("item_id")
);
CREATE TABLE item_updates
(
update_id bigint BIGSERIAL,
item_id_fk bigint NOT NULL,
date TIMESTAMP default now(),
-- (more columns)
description character varying(2000) NOT NULL,
CONSTRAINT "item_updates_pk"
PRIMARY KEY ("update_id"),
CONSTRAINT "item_updates_item_fk"
FOREIGN KEY (item_id_fk) REFERENCES items(item_id) NOT DEFERRABLE,
);
When I insert a new item in table items the description is set. When the description changes, an update is inserted in table item_updates that references the item_id and with that the description is changed and history preserved.
I know how to find the latest description for a given item_id
in table item_updates
, but where I stuck is the following; a query that gives me the latest description for an item.
For an item_id
that has no updates in table item_updates
, it should show the description and (many more columns)
from table items
.
For an item that has updated in table item_updates
, it should show the latest description from table item_updates
including (more columns)
and the corresponding (many more columns)
from table items
.
The resulting query should contain all item_id
entries (+ columns) and corresponding latest item_updates
columns.
What I have now is this:
SELECT
COALESCE(i.description) AS desccription, *
FROM
items i
LEFT JOIN
(SELECT
itm.item_id, MAX(upd.item_id) AS uid
FROM
items itm
LEFT JOIN
item_updates upd ON itm.item_id = upd.item_id_fk
GROUP BY
itm.item_id) AS sub ON sub.item_id = i.item_id;
The COALESCE
is there because tried COALESCE(sub.description, i.description)
but that was not allowed. The result of this query is incorrect however. The error says it can 'see' table sub in the first SELECT
statement.
I see many questions and answers when all info is in the same table, but I could not find (or did not have the right search terms) this particular scenario. The answer that CoPilot gave me was also wrong.
CREATE TABLE items (
item_id bigserial PRIMARY KEY
, inserted_at timestamp NOT NULL DEFAULT now()
, description varchar(2000) NOT NULL
-- , many more columns
);
CREATE TABLE item_updates (
update_id bigserial PRIMARY KEY
, item_id_fk bigint NOT NULL
, updated_at timestamp NOT NULL DEFAULT now()
, description varchar(2000) NOT NULL
-- , more columns
, CONSTRAINT item_updates_item_fk FOREIGN KEY (item_id_fk) REFERENCES items
);
Fixed a couple of syntax issues.
And I don't use the name "date" for a timestamp
column, that would be self-sabotage. Using inserted_at
and updated_at
instead. And more often than not, you really want timestamptz
, not timestamp
. See:
All items:
SELECT i.item_id, COALESCE(u.description, i.description) AS description
-- , i.column1, u.column2 -- whatever
FROM items i
LEFT JOIN (
SELECT DISTINCT ON (u.item_id_fk)
* -- or whatever you want
FROM item_updates u
ORDER BY u.item_id_fk, u.update_at DESC
) u ON u.item_id_fk = i.item_id;
Given item_id
:
SELECT i.item_id, COALESCE(u.description, i.description) AS description
-- , i.column1, u.column2 -- whatever
FROM items i
LEFT JOIN LATERAL (
SELECT * -- or whatever you want
FROM item_updates u
WHERE u.item_id_fk = i.item_id
ORDER BY u.update_at DESC
LIMIT 1
) u ON true
WHERE i.item_id = 123; -- your input here
fiddle
Related:
About DISTINCT ON
:
And how to optimize performance for certain data distributions:
About LATERAL
subqueries: