Tags: postgresql, amazon-rds, relational-database, amazon-aurora, aws-aurora-serverless

How can I update a large table slowly?


I am using PostgreSQL 15 hosted on AWS Aurora Serverless v2. I have a table that includes 200 million rows, and I need to update the value of one column for all of those rows. The table's primary key id is a randomly generated UUID. It doesn't have any other index.

When I run update Table set column=value on this table, it consumes all of the database's CPU/memory, which impacts production traffic. Since the table doesn't have a distribution key, I can't update the rows page by page.

I am looking for a way to update it slowly. I don't mind if it takes a few days or a few weeks.


Solution

  • Do it in small batches

    create table already_updated as select id from your_table limit 0;
    create unique index on already_updated(id) with (fillfactor=100);
    

    This processes 10k rows at a time, working through the table in chunks in the order rows are laid out on disk, according to the hidden system column ctid:

    with selected_batch as(
       select ctid 
       from your_table
       where id not in (table already_updated)
       order by ctid 
       limit 1e4)
    ,updated_batch as(
       update your_table
       set col1='new_value'
       from selected_batch 
       where selected_batch.ctid=your_table.ctid
       returning your_table.id)
    insert into already_updated table updated_batch;
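
    To drive those batches without babysitting them, here's a minimal sketch of a loop, assuming PostgreSQL 11+ (which allows transaction control inside do blocks) and reusing the placeholder names your_table/col1 from above. Run it outside an explicit transaction block; it commits after every batch and sleeps, so locks are released and CPU headroom is left for production:

    do $$
    declare
      rows_done bigint;
    begin
      loop
        with selected_batch as(
           select ctid
           from your_table
           where id not in (table already_updated)
           order by ctid
           limit 1e4)
        ,updated_batch as(
           update your_table
           set col1='new_value'
           from selected_batch
           where selected_batch.ctid=your_table.ctid
           returning your_table.id)
        insert into already_updated table updated_batch;
        get diagnostics rows_done = row_count;  --rows inserted by this batch
        exit when rows_done = 0;                --nothing left to process
        commit;               --needs pg11+; keeps each transaction short
        perform pg_sleep(1);  --arbitrary throttle, tune to your load
      end loop;
    end $$;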
    

    If it's really a static value, the same for all 200M records, you don't need the separate progress-tracking table or a stable id to track progress by - the absence of 'new_value' itself tells you a row still needs the update:

    with cte as(select ctid 
                from your_table
                where col1 is distinct from 'new_value' --unlike <>, also matches nulls
                limit 1e4)
    update your_table
    set col1='new_value'
    from cte
    where cte.ctid=your_table.ctid;
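
    If you drive that from psql, one low-effort scheduler is psql's \watch (available since psql 9.3): run the statement above once, then have psql re-execute it every few seconds, and cancel it once it starts reporting UPDATE 0:

    \watch 10

    Each batch leaves dead tuples behind, so check that autovacuum keeps up, or run a manual vacuum on your_table between rounds.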
    

    The same applies if the value isn't identical for all rows, but you can still tell a not-yet-updated row from an already-processed one in a where clause. If you're doing this on a regular basis to update a field that's calculated from other fields of the same table, consider making it a generated column:

    alter table your_table 
       drop column if exists col1
     , add column col1 numeric generated always as (col2*col3) stored;
    

    That'll update on its own whenever either of the other two columns changes.
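
    A quick sketch of that behavior (generated columns need PostgreSQL 12+; gen_demo is a made-up name mirroring the columns in the statement above):

    create table gen_demo(col2 numeric, col3 numeric
      , col1 numeric generated always as (col2*col3) stored);
    insert into gen_demo(col2,col3) values(2,10);
    update gen_demo set col3=20;
    select col1 from gen_demo; --returns 40, recomputed by the update itself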


    You can also consider hiding the table behind a view if you need an additional column with a static value that's always the same for all rows, or if relatively few need to diverge from it:
    demo at db<>fiddle

    create table test(id,x)as select generate_series(1,5e5),(random()*1e7)::int;
    alter table test add primary key(id);
    
    create table default_value_for_test(id,y)as values(0,'default value');
    alter table default_value_for_test 
       add primary key(id)
      ,add constraint only1row check(id=0);
    create table diverged_values_for_test(id,y)as values(1,'val1'),(4,'val4');
    alter table diverged_values_for_test 
       add primary key(id)
      ,add foreign key(id)references test(id)
        on update cascade on delete cascade;
    
    create view v_test as 
      select t.*, coalesce(diverged_.y,default_.y) as y
      from test as t(id,x) 
      left join diverged_values_for_test as diverged_(id,y)using(id)
      cross join default_value_for_test as default_(id,y);
    
    explain analyze verbose 
    update default_value_for_test set y='newer static value';
    
    Update on public.default_value_for_test (actual time=0.048..0.049 rows=0 loops=1)
      ->  Seq Scan on public.default_value_for_test (actual time=0.009..0.010 rows=1 loops=1)
    Execution Time: 0.081 ms

    It took under 0.1 ms to update the tiny single-row table that holds what you want to see as the additional, static column of the big one. Alternatively, you could keep the default value as a literal constant in the view definition, and whenever you need to change it, run create or replace view to swap it out.
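
    That variant could look like this sketch, reusing the demo tables above (v_test2 is a hypothetical name; the default lives only in the view text, so changing it is a catalog-only operation that touches none of the rows):

    create or replace view v_test2 as
      select t.*, coalesce(diverged_.y, 'default value') as y
      from test as t(id,x)
      left join diverged_values_for_test as diverged_(id,y) using(id);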

    You're also free to add exceptions and let the value diverge for some rows - if there's little traffic on the column, it'll be a long time before enough rows diverge to make it worth turning diverged_ into an actual additional column on your table with 200M records, with default_ becoming that column's default. Inserts and updates on diverged_ should also be far faster than updates on the main table.
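
    If you ever reach that point, the fold-back is a sketch along these lines (on PostgreSQL 11+, adding a column with a constant default is a metadata-only change, so only the few diverged rows cost an actual write):

    alter table test add column y text default 'default value';
    update test             --backfill only the diverged rows
    set y=d.y
    from diverged_values_for_test as d
    where d.id=test.id;

    After that, the view and both side tables can be dropped.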

    If you instead tried to save it in all 500k rows of the test table right now, it would obviously take more time and space:

    alter table test add column z text default 'another static value';
    explain analyze verbose 
    update test set z='another static value';
    
    Update on public.test (actual time=4894.635..4894.636 rows=0 loops=1)
      ->  Seq Scan on public.test (actual time=0.009..143.322 rows=500000 loops=1)
    Execution Time: 4894.668 ms

    The cost of having your table split in three and hidden behind a view is paid on reads: the three tables need to be joined back together to get the full picture, but the price of cross joining the single cell to every row is negligible, and the cost of the left join against diverged_ depends on how many rows have diverged.


    For batch updates, you can also declare a cursor and use update ... where current of your_cursor, as sketched below.
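
    A sketch of that pattern in PL/pgSQL, again with the placeholder names your_table/col1 - with the caveat that you can't commit inside a loop over a bound cursor, so the whole run is one long transaction, which makes it less gentle than the committed batches above:

    do $$
    declare
      cur cursor for select id from your_table;
      r record;
      n bigint := 0;
    begin
      open cur;
      loop
        fetch cur into r;
        exit when not found;       --cursor exhausted
        update your_table
        set col1='new_value'
        where current of cur;      --updates the row the cursor sits on
        n := n + 1;
        if n % 10000 = 0 then
          perform pg_sleep(1);     --pause every 10k rows
        end if;
      end loop;
      close cur;
    end $$;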