I've got two sets of data in postgres, the first (mistakenly) uses a bigserial as the key:
CREATE TABLE info_tbl (
id bigserial NOT NULL,
"uuid" uuid NOT NULL,
"time_stamp" timestamp NOT NULL,
"json_thing" jsonb NOT NULL,
CONSTRAINT info_tbl_pkey PRIMARY KEY (id)
);
The correct table is generated as below (has the uuid as the primary key):
CREATE TABLE info_tbl_correct (
"uuid" uuid NOT NULL,
"time_stamp" timestamp NOT NULL,
"json_thing" jsonb NOT NULL,
CONSTRAINT info_tbl_correct_pkey PRIMARY KEY (uuid)
);
I've been trying to generate the posgres code to copy the old data from info_tbl
into info_tbl_correct
, however I'm coming unstuck as when it detects a conflict I want to only copy across data that has a newer timestamp.
INSERT INTO info_tbl_correct
("uuid", "time_stamp", "json_thing")
SELECT src."uuid", src."time_stamp", src."json_thing"
FROM info_tbl AS src
ON CONFLICT ("uuid")
DO UPDATE
SET "time_stamp"=EXCLUDED."time_stamp", "json_thing"=EXCLUDED."json_thing";
Can I get postgres to support two conflict conditions, or can I sort the data as it is being read from info_tbl
? (This would mean that the data will always be newer as it gets read from old->new).
I am not psotgre expert. However,you can try this. You can achieve this by using a combination of ROW_NUMBER() and PARTITION BY to sort the data from info_tbl before inserting it into info_tbl_correct. This way, you can ensure that the newer data is inserted, even if there are conflicts.
Here's the SQL query that you can use:
WITH cte AS (
SELECT
"uuid",
"time_stamp",
"json_thing",
ROW_NUMBER() OVER (PARTITION BY "uuid" ORDER BY "time_stamp" DESC) AS rn
FROM info_tbl
)
INSERT INTO info_tbl_correct ("uuid", "time_stamp", "json_thing")
SELECT "uuid", "time_stamp", "json_thing"
FROM cte
WHERE rn = 1
ON CONFLICT ("uuid") DO NOTHING;