I'm trying to speed up the process of dumping and restoring databases. At the moment, we do pg_dump --no-privileges --no-owner -w source_db | psql target_db
(the DB is created before this). I read that the custom format is preferable, and I expected a binary format to be faster than a textual representation.
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative | Resources |
---|---|---|---|---|---|
pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
693.527 ± 5.741 | 688.635 | 699.848 | 1.03 ± 0.03 | User: 165.459 s, System: 18.331 s |
pg_dump --no-privileges --no-owner -w source_db | psql target_db |
673.223 ± 8.535 | 663.433 | 679.100 | 1.00 | User: 12.348 s, System: 25.697 s |
→ psql is 20 s faster
I tested with different levels of compression because I read that compression is slow.
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative | Resources |
---|---|---|---|---|---|
pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
702.692 ± 6.316 | 698.376 | 709.942 | 1.05 ± 0.03 | User: 18.090 s, System: 28.934 s |
pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
672.341 ± 15.539 | 655.448 | 686.025 | 1.00 | User: 89.223 s, System: 21.904 s |
pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db |
693.527 ± 5.741 | 688.635 | 699.848 | 1.03 ± 0.03 | User: 165.459 s, System: 18.331 s |
pg_dump --no-privileges --no-owner -w source_db | psql target_db |
673.223 ± 8.535 | 663.433 | 679.100 | 1.00 ± 0.03 | User: 12.348 s, System: 25.697 s |
→ same performance in the best case
As asked by @Bergi and @Nick I measured dumping and restoring separately.
hyperfine --runs=3 -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db > ./1' \
'pg_dump --compress=1 --no-privileges --format=custom source_db > ./2' \
'pg_dump --no-privileges --no-owner -w source_db > ./3'
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
pg_dump --compress=0 --no-privileges --format=custom source_db > ./1 |
77.648 ± 8.720 | 71.724 | 87.660 | 1.09 ± 0.12 |
pg_dump --compress=1 --no-privileges --format=custom source_db > ./2 |
104.646 ± 0.911 | 103.804 | 105.614 | 1.47 ± 0.02 |
pg_dump --no-privileges --no-owner -w source_db > ./3 |
71.412 ± 1.037 | 70.321 | 72.385 | 1.00 |
→ psql is 7s s faster
hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3 -i --export-markdown report.md \
'cat ./1 | pg_restore --dbname=target_db' \
'cat ./2 | pg_restore --dbname=target_db' \
'cat ./3 | psql target_db'
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
cat ./1 | pg_restore --dbname=target_db |
658.715 ± 24.712 | 637.183 | 685.697 | 1.11 ± 0.04 |
cat ./2 | pg_restore --dbname=target_db |
591.001 ± 8.025 | 582.334 | 598.172 | 1.00 |
cat ./3 | psql target_db |
691.756 ± 15.320 | 680.285 | 709.154 | 1.17 ± 0.03 |
→ restoring is 10x slower than dumping, but is slower when separated from dumping.
Is there anything I can do to make this faster without writing the dump to disk (= using --format=directory
which would enable parallelization)?
You answered the question yourself: use
pg_dump -F c -Z 0 ... | pg_restore ...
That is, don't compress the dump. If plain format dump is slightly faster for you, use that.
Since the restore is slow for you, tune the target database for that job:
increase max_wal_size
a lot
increase maintenance_work_mem
a lot
set wal_level = minimal
and use --single-transaction
with psql
or pg_restore
, so that PostgreSQL can skip writing WAL