Tags: postgresql, stored-procedures, bulkinsert, insert-update

PostgreSQL - multiple insert within stored procedure


I have a client-side entity that contains a list of secondary entities; in the DB this is represented by a foreign key constraint.

The client-side creation form allows creating both the main entity and its sub-entities at once, so on the DB side I may have to save a new main entity together with some new sub-entities in one transaction. For this reason I would like to use a stored procedure on the DB side, and the most immediate approach would seem to be passing the main entity's fields as arguments, with the sub-entities as a collection argument of some sort (likely a json string, to account for the heterogeneity of the sub-entities' data).

Within my stored procedure I would first save the main entity, then save the sub-entities, possibly making use of the newly generated main entity ID (as I understand it, the procedure itself runs as a transaction, so a failure after writing the main entity would roll back that insert/update).
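For the generated ID I'd rely on RETURNING ... INTO; the table, column, and variable names here are just illustrative:

    -- inside the procedure body: capture the generated key of the main entity
    INSERT INTO main_entity (name)
    VALUES (p_name)
    RETURNING id INTO v_main_id;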

My problem lies in saving the sub-entities (from here on just "entities"). While I understand that, for existing entities, I need to run each UPDATE on its own, I would like to use the multi-row insert syntax for the new ones. For that I want to build a prepared statement whose list of inserted tuples can grow, since the length of the entity list varies from call to call, as in INSERT INTO ... VALUES <tuple1>, <tuple2>, ..., and here comes the pain...
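For concreteness, this is the statement shape I'm after (table and column names are made up):

    -- the tuple count varies from call to call
    INSERT INTO sub_entity (main_id, name, payload)
    VALUES ($1, $2, $3),
           ($4, $5, $6);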

It would of course be simplest to just concatenate the raw values from each entity to build the query string, but that invites SQL injection.

On the other hand, while I can build the insert statement using the $n placeholder notation, I then have a problem executing it, i.e. wiring the parameters:

    -- prepare insert statement stmt

    EXECUTE stmt USING <???> -- variable number of parameters to wire,
                             -- I don't know how to work around it.

Is there a way to use a single multi-row insert, or should I just fall back to inserting values one by one? I assume that a multi-row insert doesn't actually just perform single-row inserts repeatedly, and that there would be some benefit in using it over the one-by-one approach. Of course, I would rather avoid convoluted workarounds that slow the operation down more than the multi-row insert speeds it up.


Solution

  • You could package the tuples in a json[] and deparse them in the INSERT statement (a set-based sketch follows below), but I don't really see the point.

    I would just run single-row INSERT statements for each entity (see the loop sketch below).

    I guess you are worried about performance, but maybe all that parsing and deparsing would outweigh the advantage of inserting everything in a single statement.

    Apart from that, I'd expect the code to be cleaner and easier to maintain with simple INSERTs in a loop.
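
    A minimal sketch of the loop, written as a function; the table names, columns, and the shape of the jsonb payload are assumptions, not something given in the question:

        CREATE OR REPLACE FUNCTION save_main_entity(
            p_name     text,
            p_children jsonb  -- e.g. '[{"name": "a"}, {"name": "b"}]'
        ) RETURNS bigint
        LANGUAGE plpgsql AS
        $$
        DECLARE
            v_main_id bigint;
            v_child   jsonb;
        BEGIN
            INSERT INTO main_entity (name)
            VALUES (p_name)
            RETURNING id INTO v_main_id;

            -- one parameterized INSERT per sub-entity:
            -- no dynamic SQL, hence no injection risk
            FOR v_child IN SELECT * FROM jsonb_array_elements(p_children)
            LOOP
                INSERT INTO sub_entity (main_id, name)
                VALUES (v_main_id, v_child->>'name');
            END LOOP;

            RETURN v_main_id;
        END;
        $$;

    If you do want a single multi-row statement, the "deparsing" can be done set-based inside the function, replacing the loop above:

        -- one INSERT driven by the whole json array
        INSERT INTO sub_entity (main_id, name)
        SELECT v_main_id, x.name
        FROM jsonb_to_recordset(p_children) AS x(name text);

    Either way the function's parameter list stays fixed, so the variable length of the entity list is no longer a problem.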