Search code examples
sqlpostgresqlupsert

Use multiple conflict_target in ON CONFLICT clause


I have two separately unique columns in a table: col1, col2. Both have a unique index (col1 is unique and so is col2).

I need INSERT ... ON CONFLICT ... DO UPDATE syntax, and update other columns in case of a conflict, but I can't use both columns as conflict_target.

It works:

INSERT INTO table
...
ON CONFLICT ( col1 ) 
DO UPDATE 
SET 
-- update needed columns here

But how to do this for several columns, something like this:

...
ON CONFLICT ( col1, col2 )
DO UPDATE 
SET 
....

Currently using Postgres 9.5.


Solution

  • A sample table and data

    CREATE TABLE dupes(col1 int primary key, col2 int, col3 text,
       CONSTRAINT col2_unique UNIQUE (col2)
    );
    
    INSERT INTO dupes values(1,1,'a'),(2,2,'b');
    

    Reproducing the problem

    INSERT INTO dupes values(3,2,'c')
    ON CONFLICT (col1) DO UPDATE SET col3 = 'c', col2 = 2
    

    Let's call this Q1. The result is

    ERROR:  duplicate key value violates unique constraint "col2_unique"
    DETAIL:  Key (col2)=(2) already exists.
    

    What the documentation says

    conflict_target can perform unique index inference. When performing inference, it consists of one or more index_column_name columns and/or index_expression expressions, and an optional index_predicate. All table_name unique indexes that, without regard to order, contain exactly the conflict_target-specified columns/expressions are inferred (chosen) as arbiter indexes. If an index_predicate is specified, it must, as a further requirement for inference, satisfy arbiter indexes.

    This gives the impression that the following query should work, but it does not because it would actually require a together unique index on col1 and col2. However such an index would not guarantee that col1 and col2 would be unique individually which is one of the OP's requirements.

    INSERT INTO dupes values(3,2,'c') 
    ON CONFLICT (col1,col2) DO UPDATE SET col3 = 'c', col2 = 2
    

    Let's call this query Q2 (this fails with a syntax error)

    Why?

    Postgresql behaves this way is because what should happen when a conflict occurs on the second column is not well defined. There are number of possibilities. For example in the above Q1 query, should postgresql update col1 when there is a conflict on col2? But what if that leads to another conflict on col1? how is postgresql expected to handle that?

    A solution

    A solution is to combine ON CONFLICT with old fashioned UPSERT.

    CREATE OR REPLACE FUNCTION merge_db(key1 INT, key2 INT, data TEXT) RETURNS VOID AS
    $$
    BEGIN
        LOOP
            -- first try to update the key
            UPDATE dupes SET col3 = data WHERE col1 = key1 and col2 = key2;
            IF found THEN
                RETURN;
            END IF;
    
            -- not there, so try to insert the key
            -- if someone else inserts the same key concurrently, or key2
            -- already exists in col2,
            -- we could get a unique-key failure
            BEGIN
                INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col1) DO UPDATE SET col3 = data;
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                BEGIN
                    INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col2) DO UPDATE SET col3 = data;
                    RETURN;
                EXCEPTION WHEN unique_violation THEN
                    -- Do nothing, and loop to try the UPDATE again.
                END;
            END;
        END LOOP;
    END;
    $$
    LANGUAGE plpgsql;
    

    You would need to modify the logic of this stored function so that it updates the columns exactly the way you want it to. Invoke it like

    SELECT merge_db(3,2,'c');
    SELECT merge_db(1,2,'d');