Search code examples
postgresqladmin

How to add a not null column to postgresql table without doubling its size on disk


Is there a way to add a not null type column to a Postgresql table without doubling its size on disk? If for instance I have a table with some columns defined, and I want to add a column I'd do the following:

alter table my_table add new_column int  
update table my_table set new_column = 0 
alter table my_table alter column new_column set not null

This, in effect doubles the space that is allocated for a table because of the way Postgresql works. Updates are creating new tuples that will be marked for reuse after this transaction finishes and vacuum does its job. If table is large in size (ie. few million rows) but very slowly growing or is almost constant in size those rows will never be reused, and only a 'vacuum full' or a full database backup and restore will reclaim space on disk. Is there a way to automatically add a column with some default value but without this behavior? For instance if there would be a way to lock a table and do the update then there would be no need for MVCC in this case.


Solution

  • do it in steps:

    1. alter table add new column
    2. alter table add default value for column
    3. update, but not whole table with 1 update statement, but issue it in like 10000 separate updates, each in its own transaction
    4. run vacuum every couple hundred updates, or better - autovacuum
    5. alter table set not null