Search code examples
stringpostgresqlparametersstring-interpolationdblink

PostgreSQL DBLink where statement using variable from local database


I have two PostgreSQL databases (one source and one destination) with multiple tables. I managed to transfer data (copy data, or copy subset of data) from source to destination using DBLink.

My question is on how I could make the statement more elegant because is generated dynamically. For example currently I have the following statement:

DO $$
BEGIN
DECLARE
    upsert_value Timestamp;
BEGIN
timestamp_value := ISNULL((SELECT "timestamp" FROM system.lastupdatetable_stg WHERE "group_id" = 54 AND "tenant_id" = 23 AND "table_id" = 3327 LIMIT 1), to_timestamp(0));

INSERT INTO public."AccessToDivisions"
("CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId")
SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId"
FROM dblink('<connection>',
'SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId" FROM public."AccessToDivisions" t
WHERE 1=1 
AND "ModifiedAt" >' + timestamp_value
) AS rt("CreatedAt" timestamp without time zone, "Default" boolean, "Division" integer, "Employee" character(36), "GCRecord" integer, "ModifiedAt" timestamp without time zone, "OID" integer, "OptimisticLockField" integer, "TenantId" character(36))
ON CONFLICT("TenantId", "OID")
DO UPDATE SET "CreatedAt" = excluded."CreatedAt", "Default" = excluded."Default", "Division" = excluded."Division", "Employee" = excluded."Employee", "GCRecord" = excluded."GCRecord", "ModifiedAt" = excluded."ModifiedAt", "OptimisticLockField" = excluded."OptimisticLockField";

INSERT INTO system.lastupdatetable_stg ("group_id", "tenant_id", "table_id", "snapshot", "timestamp") VALUES (54, 23, 3327, false, (SELECT MAX("ModifiedAt") FROM public."AccessToDivisions")) ON CONFLICT ("group_id", "tenant_id", "table_id") DO UPDATE SET "timestamp" = excluded."timestamp";

END;
END $$;

The system.lastupdatetable_stg resides into destination database. Is it possible not to declare a variable upsert_value but instead make the call inside the dblink statement, like the following?

DO $$
BEGIN
INSERT INTO public."AccessToDivisions"
("CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId")
SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId"
FROM dblink('<connection>',
'SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId" FROM public."AccessToDivisions" t
WHERE 1=1 
AND "ModifiedAt" > ISNULL((SELECT "timestamp" FROM system.lastupdatetable_stg WHERE "group_id" = 54 AND "tenant_id" = 23 AND "table_id" = 3327 LIMIT 1), to_timestamp(0))'
) AS rt("CreatedAt" timestamp without time zone, "Default" boolean, "Division" integer, "Employee" character(36), "GCRecord" integer, "ModifiedAt" timestamp without time zone, "OID" integer, "OptimisticLockField" integer, "TenantId" character(36))
ON CONFLICT("TenantId", "OID")
DO UPDATE SET "CreatedAt" = excluded."CreatedAt", "Default" = excluded."Default", "Division" = excluded."Division", "Employee" = excluded."Employee", "GCRecord" = excluded."GCRecord", "ModifiedAt" = excluded."ModifiedAt", "OptimisticLockField" = excluded."OptimisticLockField";

INSERT INTO system.lastupdatetable_stg ("group_id", "tenant_id", "table_id", "snapshot", "timestamp") VALUES (54, 23, 3327, false, (SELECT MAX("ModifiedAt") FROM public."AccessToDivisions")) ON CONFLICT ("group_id", "tenant_id", "table_id") DO UPDATE SET "timestamp" = excluded."timestamp";
END $$;

The problem in the second query is that the system.lastupdatetable_stg is not found as it is looking into source database.

Is there any way to make the select statement be executed into destination, or any way to pass argument inside dblink instead of simple string interpolation using quotes?


Solution

  • Seems that there no way to execute a command inside dblink which will get something from destination as dblink executes on source. I think that comment od @Frank Heiken partialy solves this issue using foreign table on source database, but at the same time increases the compelxity if you have to deal with hundreds of databases.

    I was able just to make a bit more beautiful the SQL Statement using format and %L parameter as a replacement of quote interpolation, this will simplify also the higher level code (C#) while producing this code:

    DO $$
    BEGIN
    DECLARE
        upsert_value Timestamp;
    BEGIN
    timestamp_value := ISNULL((SELECT "timestamp" FROM system.lastupdatetable_stg WHERE "group_id" = 54 AND "tenant_id" = 23 AND "table_id" = 3327 LIMIT 1), to_timestamp(0));
    
    INSERT INTO public."AccessToDivisions"
    ("CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId")
    SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId"
    FROM dblink('<connection>',
    format('SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId" FROM public."AccessToDivisions" t
    WHERE 1=1 
    AND "ModifiedAt" > %L', timestamp_value)
    ) AS rt("CreatedAt" timestamp without time zone, "Default" boolean, "Division" integer, "Employee" character(36), "GCRecord" integer, "ModifiedAt" timestamp without time zone, "OID" integer, "OptimisticLockField" integer, "TenantId" character(36))
    ON CONFLICT("TenantId", "OID")
    DO UPDATE SET "CreatedAt" = excluded."CreatedAt", "Default" = excluded."Default", "Division" = excluded."Division", "Employee" = excluded."Employee", "GCRecord" = excluded."GCRecord", "ModifiedAt" = excluded."ModifiedAt", "OptimisticLockField" = excluded."OptimisticLockField";
    
    INSERT INTO system.lastupdatetable_stg ("group_id", "tenant_id", "table_id", "snapshot", "timestamp") VALUES (54, 23, 3327, false, (SELECT MAX("ModifiedAt") FROM public."AccessToDivisions")) ON CONFLICT ("group_id", "tenant_id", "table_id") DO UPDATE SET "timestamp" = excluded."timestamp";
    
    END;
    END $$;
    

    So, at the end seems that format works also inside DBLink as excepted, this way on C# side can be:

    For Full Load:

    FROM dblink('<connection>',
    'SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId" FROM public."AccessToDivisions" t
    WHERE 1=1
    

    For incremental Load:

    FROM dblink('<connection>',
    format('SELECT "CreatedAt", "Default", "Division", "Employee", "GCRecord", "ModifiedAt", "OID", "OptimisticLockField", "TenantId" FROM public."AccessToDivisions" t
    WHERE 1=1 
    AND "ModifiedAt" > %L', timestamp_value)