Search code examples
postgresqlinsert-selectsql-returning

PostgreSQL: copy the table rows and get the old-new matching information


Here we have a certain table:

CREATE TABLE mytbl (
  id int PRIMARY KEY generated by default as identity,
  col1 int,
  col2 text, ...
);

We need to copy part of the rows of the table and get information about the correspondence of the old and new IDs (in order to then copy the related data from other tables).

I tried such a query:

insert into mytbl (col1, col2)
    select col1, col2
        from mytbl old
        where col1 = 999 -- some condition
    returning 
        old.id as old_id,
        id as new_id;

But this query produces the error: ERROR: missing FROM-clause entry for table "old"

Is there a way to write a similar query without an error?

What I've already thought about:

  • copy one line at a time in a loop — apparently I will have to, unless an easier way is found.
  • alter table mytbl add column old_id (foreign key references mytbl) and then insert into mytbl (old_id, col1, col2) select id, col1, col2 ... returning id as new_id, old_id — but in reality, we do not need to store this information, because it is needed only at the moment of creating a copy.
  • INSERT SELECT without RETURNING; and then SELECT old.id as old_id, new.id as new_id FROM mytbl new JOIN mytbl old ON (some connection condition by intelligent key) — is quite difficult in my case.

Solution

  • You can (but probably shouldn't) rely on the order of inserted rows to match between the selection and the RETURNING clause:

    WITH selection AS (
        SELECT id, col1, col2, row_number() OVER (ORDER BY id)
        FROM mytbl
        WHERE col1 = 999 -- some condition
        ORDER BY id
    ), inserted AS (
        INSERT INTO mytbl (col1, col2)
        SELECT col1, col2
        FROM selection
        ORDER BY selection.id
        RETURNING id
    )
    SELECT s.id AS old_id, ins.id AS new_id
    FROM (SELECT inserted.id, row_number() OVER (ORDER BY inserted.id) FROM inserted) AS ins
    JOIN selection USING (row_number);
    

    A probably better approach (which also works nicely when you need to copy in multiple mutually-dependant tables) is to generate the new ids ahead of the insert, using the sequence that is underlying the identity column:

    WITH selection AS (
        SELECT nextval(pg_get_serial_sequence('mytbl', 'id')) AS new_id, id AS old_id, col1, col2
        FROM mytbl
        WHERE col1 = 999 -- some condition
    ), inserted AS (
        INSERT INTO mytbl(id, col1, col2)
        OVERRIDING SYSTEM VALUE -- necessary when using identity columns!
        SELECT new_id, col1, col2
        FROM selection
    )
    SELECT old_id, new_id
    FROM selection;
    

    This does however require USAGE permission on the underlying sequence, which may not be granted by default even if a role can normally use the identity column.