Search code examples
sqlpostgresqlsql-updatehstorepostgresql-9.4

UPDATE existing rows from hstore column in another table


I need to write a query (or function) that will update existing records in a table from with values stored in an hstore column in another table. For example:

create temp table foo(id int primary key, f1 int, f2 text, f3 int);
insert into foo values
  (1, 1, 'jack', 1),
  (2, 2, 'ted' , 2),
  (3, 3, 'fred', 3);

create temp table bar(foo_id int references foo(id), row_data hstore);
insert into bar values
  (1, 'f1=>0, f2=>bill'::hstore),
  (2, 'f1=>0, f2=>will, f3=>0'::hstore),
  (3, 'f3=>0'::hstore);

Only columns that have values in the hstore column should get updated, so after processing, the desired result would be:

select * from foo;
+----+----+------+----+
| id | f1 |  f2  | f3 |
+----+----+------+----+
|  1 |  0 | bill |  1 |
|  2 |  0 | will |  0 |
|  3 |  3 | fred |  0 |
+----+----+------+----+

What is the "best" way to update foo with the values in bar?

Note: I'm defining best as being the easiest to code. While performance is always important, this is a batch job and the speed is not as critical as it might be if a user was waiting on the results.

I'm using PostgreSQL 9.4.


Solution

  • To retain original column values if nothing is supplied in the hstore column ...

    Simple method with COALESCE

    UPDATE foo f
    SET    f1 = COALESCE((b.row_data->'f1')::int, f1)
         , f2 = COALESCE( b.row_data->'f2'      , f2)
         , f3 = COALESCE((b.row_data->'f3')::int, f3)
    FROM   bar b
    WHERE  f.id = b.foo_id
    AND    b.row_data ?| '{f1,f2,f3}'::text[];
    
    • The added last line excludes unaffected rows from the UPDATE right away: the ?| operator checks (per documentation):

    does hstore contain any of the specified keys?

    If that's not the case it's cheapest not to touch the row at all.
    Else, at least one (but not necessarily all!) of the columns receives an UPDATE. That's where COALESCE comes in.

    However, per documentation:

    A value (but not a key) can be an SQL NULL.

    So COALESCE cannot distinguish between two possible meanings of NULL here:

    • The key 'f2'` was not found.
    • b.row_data->'f2' returns NULL as new value for f2.

    Works for NULL values, too

    UPDATE foo f
    SET    f1 = CASE WHEN b.row_data ? 'f1'
                     THEN (b.row_data->'f1')::int ELSE f1 END
         , f2 = CASE WHEN b.row_data ? 'f2'
                     THEN b.row_data->'f2'        ELSE f2 END
         , f3 = CASE WHEN b.row_data ? 'f3'
                     THEN (b.row_data->'f3')::int ELSE f3 END
    FROM   bar b
    WHERE  f.id = b.foo_id
    AND    b.row_data ?| '{f1,f2,f3}'::text[];
    

    The ? operator checks for a single key:

    does hstore contain key?