Search code examples
sqlpostgresqlpostgresql-14

Errors attempting to INSERT INTO table with identity column from a staging table


I'm trying to insert data from a staging table into a another similar table with an identity column and cannot get the SQL syntax correct without errors. This is in PostgreSQL 14.

Staging table:

CREATE TABLE IF NOT EXISTS public.productstaging
(
    guid varchar(64) NOT NULL,
    productimagehash_sha2256 varchar(64) NOT NULL,
    productimage Bytea NOT NULL,
    UNIQUE (productimagehash_sha2256)
);

Table to insert into:

CREATE TABLE IF NOT EXISTS public.product
(
    id int NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    guid varchar(64) NOT NULL,
    productimagehash_sha2256 varchar(64) NOT NULL,
    productimage Bytea NOT NULL
);

Insert query:

-- Insert
INSERT INTO public.product
SELECT

public.productstaging.guid,
public.productstaging.productimagehash_sha2256,
public.productstaging.productimage

FROM public.productstaging
LEFT OUTER JOIN public.product
ON (

public.product.guid = public.productstaging.guid
AND public.product.productimagehash_sha2256 = public.productstaging.productimagehash_sha2256
)
WHERE public.product.guid IS NULL
AND public.product.productimagehash_sha2256 IS NULL;

I get an error

ERROR: column "id" is of type integer but expression is of type character varying

I have tried several things (listed below) on the query but they all give errors. Most of the example when searching insert from a list of fixed values rather than insert from another table, eg ...VALUES(guid, productimagehash_sha2256, productimage).... I can't find anything similar on a search and hoped someone can point me in the right direction?

...
DEFAULT, --ERROR:  DEFAULT is not allowed in this context
public.productstaging.guid,
public.productstaging.productimagehash_sha2256,
public.productstaging.productimage
...

...
0, --ERROR:  cannot insert a non-DEFAULT value into column "id"
public.productstaging.guid,
public.productstaging.productimagehash_sha2256,
public.productstaging.productimage
...

...
null, --ERROR:  cannot insert a non-DEFAULT value into column "id"
public.productstaging.guid,
public.productstaging.productimagehash_sha2256,
public.productstaging.productimage
...

Solution

  • Specify the target columns for the INSERT - something that you should do always.

    INSERT INTO public.product (guid, productimagehash_sha2256, productimage )
    SELECT productstaging.guid,
           productstaging.productimagehash_sha2256,
           productstaging.productimage
    FROM public.productstaging
      LEFT JOIN ...
    

    Apparently you treat the combination of guid, productimagehash_sha2256 as unique. If you create a unique index on those columns:

    create unique index on productstaging (guid, productimagehash_sha2256);
    

    then your INSERT statement gets much simpler:

    INSERT INTO public.product (guid, productimagehash_sha2256, productimage )
    SELECT guid,
           productimagehash_sha2256,
           productimage
    FROM public.productstaging
    ON CONFLICT (guid, productimagehash_sha2256) 
       DO NOTHING;
    

    Note that if guid stores a real UUID, that column should be defined with the type uuid not varchar