Image a table with a generated column like this:
CREATE OR REPLACE FUNCTION extract_first_name(p_name text) RETURNS text
LANGUAGE SQL IMMUTABLE
AS $$
SELECT split_part(p_name, ' ', 1);
$$;
CREATE TABLE customers (
id serial,
name text,
first_name text GENERATED ALWAYS AS (extract_first_name(name)) STORED
);
Later someone finds that the extract_first_name
function is too simplistic and needs to be changed. It's then updated, but the values in the first_name
column stays the same. How do we in the simplest and most efficient way recompute all the values in the first_name
column to use the latest version of the function w/o locking the table?
A generated column is 'computed' on 2 events. When the row is Inserted and when the underlying column is Updated. Thus you need to Update every row with something like
Update customers
set name = name;
This will, as needed, update every row in the table. But as Postgres uses a MVCC model this will not block others from selecting during the operation.
The main advantage of using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading. PostgreSQL maintains this guarantee even when providing the strictest level of transaction isolation through the use of an innovative Serializable Snapshot Isolation (SSI) level.
However, you will need to lookout for deadlocks if you continue to allow updates during this process.