I am using pg_restore
to restore the Postgresql database backed up using pg_dump
. (both the backup and restore is happening on PostgreSQL 12). The backup is taken using the custom format option --format=custom
.
The restore for 10 GB backup is taking almost 30 mins. The time to restore increases significantly as the size of the backup increases. So I tried out the --jobs
parameter for improving the restore times.
As per the documentation, concurrent connections will be used to restore the database objects. I verified the output of restore and I could verify that there were parallel threads started equal to the value of the --jobs
parameter. However the time to restore has not improved with any value of the --jobs
parameter.
I am aware that the performance depends on the hardware infrastructure. But the machine has 16 vcpus and 32GB RAM.
I have also tried tuning Postgres as mentioned in the blog with following configurations but still no improvements in restore times.
work_mem = 32MB shared_buffers = 4GB maintenance_work_mem = 2GB full_page_writes = off autovacuum = off wal_buffers = -1
Is there anything that I have missed? How do I get an improvement in the restore time?
There are several things that could be the problem:
pg_restore
parallelizes by running several COPY
and CREATE INDEX
commands concurrently.
Now if your database has one large table with a single large index, parallelization won't help you.
Perhaps your I/O system is at its limit. Then running processes in parallel won't improve the performance.
If you have lots of large objects, that is known to slow down processing, and I am not sure if parallelization helps or not.
Don't ever set full_page_writes
or autovacuum
to off
unless you set them back to on
after restore and are ready to dump the database cluster in the event of a crash. I doubt that the performance gain is worth it, particularly for full_page_writes
.
One parameter that you forgot is max_wal_size
. If you raise that, it will help write performance.
Apart from that, you have to find out where the bottleneck is before you can fix it.
How about using a different backup method like pg_basebackup
that is normally faster?