Search code examples
postgresqldblinkpostgres-fdw

Postgresql and dblink: how do I do an UPDATE FROM?


Here's what works already, but it's using a loop:

(I am updating the nickname, slug field on the remote table for each row in a local table)

DECLARE
row_ record;
rdbname_ varchar;
....
/* select from local */
FOR row_ IN SELECT rdbname, objectvalue1 as keyhash, cvalue1 as slug, cvalue2 as nickname
    FROM bme_tag
    where rdbname = rdbname_
    and   tagtype = 'NAME'
    and  wkseq = 0

    LOOP
        /* update remote */
        PERFORM dblink_exec('sysdb',
            format(
                'update bme_usergroup
                set nickname = %L
                ,slug = %L
                where rdbname = %L
                and   wkseq = 0 
                and   keyhash = %L'
                , row_.nickname, row_.slug, row_.rdbname, row_.keyhash)
            );

END LOOP;

Now, what I would like to do instead is to do a bulk UPDATE (remote) FROM (local)

PERFORM dblink_exec('sysdb',
'update (remote)bme_usergroup
set    nickname = bme_tag.cvalue2, slug=bme_tag.cvalue1
from   (local).bme_tag s
where  bme_usergroup.rdbname        = %L
and    bme_usergroup.wkseq          = 0
and    bme_usergroup.keyhash        = s.keyhash
and    bme_usergroup.rdbname        = s.rdbname
)

I've gotten this far by looking a various solutions (postgresql: INSERT INTO ... (SELECT * ...)) and I know how to separate the remote and local tables of the query in the context of SELECT, DELETE and even INSERT/SELECT. And I can do that direct update with bind variables too. But how about UPDATE FROM?

If it's not possible, should I look into Postgres's FOREIGN TABLE or something similar?

The local and remote db are both on the same Postgres server. One additional bit of information, if it matters, is that either database may be dropped and restored separately from the other, and I'd prefer a lightweight solution that doesn't take a lot of configuration each time to reestablish communication.


Solution

  • Yes, you should use foreign tables with postgres_fdw.

    That way you could just write your UPDATE statement like you would for a local table.

    This should definitely be faster, but you might still be exchanging a lot of data between the databases.

    If that's an option, it will probably be fastest to run the statement on the database where the updated table is and define the other table as a foreign table. That way you will probably avoid fetching and then sending the table data.

    Use EXPLAIN to see what exactly happens!