Search code examples
postgresqljointree

Recursive CTE and multiple inserts in joined table


I'm searching to copy nodes of a hierarchical tree and to apply the changes onto a joined table. I found parts of the answer in other questions like Postgresql copy data within the tree table for the tree copy (in my case I only copy the children and not the root) and PostgreSQL - Insert data into multiple tables simultaneously to insert data in several table simultaneously, but I don't manage to mix them.

I would like to:

  1. Generate the new nodes id from the fields table
  2. Insert the new field ids in the data_versions table
  3. Insert the new nodes in the fields table with the data_id from the data_versions table

Note: there is a circular reference between the fields and the data_versions tables.

See below the schema:

Database schema

Here is a working query, but without the insert in the data_versions table. It is only a shallow copy (keeping the same data_id) while I would like a deep copy:

WITH created_data AS (
    WITH RECURSIVE cte AS (
        SELECT *, nextval('fields_id_seq') new_id FROM fields WHERE parent_id = :source_field_id
        UNION ALL
        SELECT fields.*, nextval('fields_id_seq') new_id FROM cte JOIN fields ON cte.id = fields.parent_id
    )
    SELECT C1.new_id, C1.name, C1.field_type, C1.data_id, C2.new_id new_parent_id
    FROM cte C1 LEFT JOIN cte C2 ON C1.parent_id = C2.id
)

INSERT INTO fields (id, name, parent_id, field_type, data_id)
SELECT new_id, name, COALESCE(new_parent_id, :target_field_id), field_type, data_id FROM created_data
RETURNING id, name, parent_id, field_type, data_id;

And here is the draft query I'm working on for inserting data in the data_versions table resulting with WITH clause containing a data-modifying statement must be at the top level as an error:

WITH created_data AS (
    WITH cloned_fields AS (
        WITH RECURSIVE cte AS (
            SELECT *, nextval('fields_id_seq') new_id FROM fields WHERE parent_id = :source_field_id
            UNION ALL
            SELECT fields.*, nextval('fields_id_seq') new_id FROM cte JOIN fields ON cte.id = fields.parent_id
        )
        SELECT C1.new_id, C1.name, C1.field_type, C1.data_id, C2.new_id new_parent_id
        FROM cte C1 LEFT JOIN cte C2 ON C1.parent_id = C2.id
    ),
    cloned_data AS (
        INSERT INTO data_versions (value, author, field_id)
        SELECT d.value, d.author, c.new_id
        FROM cloned_fields c
        INNER JOIN data_versions d ON c.data_id = d.id
        RETURNING id data_id
    )

    SELECT cloned_fields.new_id, cloned_fields.name, cloned_fields.field_type, cloned_fields.new_parent_id, cloned_data.data_id
    FROM cloned_fields
    INNER JOIN cloned_data ON cloned_fields.data_id = cloned_data.id
)

INSERT INTO fields (id, name, parent_id, field_type, data_id)
SELECT new_id, name, COALESCE(new_parent_id, :target_field_id), field_type, data_id FROM created_data
RETURNING id, name, parent_id, field_type, data_id, value data;


Solution

  • If other people were encountering the same issue as me, I came up with this solution some months later. The trick was to move the data-modifying CTE at the top level as suggested by the error message. We can always access previously declared CTE's:

        WITH new_fields_ids AS (
            WITH RECURSIVE cte AS (
                SELECT *, nextval('fields_id_seq') new_id FROM fields WHERE parent_id = :source_field_id
                UNION ALL
                SELECT fields.*, nextval('fields_id_seq') new_id FROM cte JOIN fields ON cte.id = fields.parent_id
            )
            SELECT C1.new_id, C1.name, C1.field_type, C1.data_id, C2.new_id new_parent_id
            FROM cte C1 LEFT JOIN cte C2 ON C1.parent_id = C2.id
        ),
        cloned_data AS (
            INSERT INTO data_versions (value, author, field_id)
            SELECT d.value, d.author, c.new_id
            FROM new_fields_ids c
            INNER JOIN data_versions d ON c.data_id = d.id
            RETURNING id AS data_id, field_id, value
        ),
        created_data AS (
            SELECT new_fields_ids.new_id, new_fields_ids.name, new_fields_ids.field_type, new_fields_ids.new_parent_id, cloned_data.data_id
            FROM new_fields_ids
            INNER JOIN cloned_data ON new_fields_ids.new_id = cloned_data.field_id
        ),
        cloned_fields AS (
            INSERT INTO fields (id, name, parent_id, field_type, data_id)
            SELECT new_id, name, COALESCE(new_parent_id, :target_field_id), field_type, data_id FROM created_data
            RETURNING id, name, parent_id, field_type, data_id
        )
    
    
        SELECT f.id, f.name, f.parent_id, f.field_type, f.data_id, d.value AS data FROM cloned_fields f
        INNER JOIN cloned_data d ON f.id = d.field_id;