Search code examples
postgresqlpg-dumppg-restore

Restoring DB using psql faster than pg_restore?


I'm trying to speed up the process of dumping and restoring databases. At the moment, we do pg_dump --no-privileges --no-owner -w source_db | psql target_db (the DB is created before this). I read that the custom format is preferable, and I expected a binary format to be faster than a textual representation.

First test

hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3  -i --export-markdown report.md \
'pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command Mean [s] Min [s] Max [s] Relative Resources
pg_dump --no-privileges --format=custom source_db | pg_restore --dbname=target_db 693.527 ± 5.741 688.635 699.848 1.03 ± 0.03 User: 165.459 s, System: 18.331 s
pg_dump --no-privileges --no-owner -w source_db | psql target_db 673.223 ± 8.535 663.433 679.100 1.00 User: 12.348 s, System: 25.697 s

→ psql is 20 s faster

Compression

I tested with different levels of compression because I read that compression is slow.

hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3  -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db' \
'pg_dump --no-privileges --no-owner -w source_db | psql target_db'
Command Mean [s] Min [s] Max [s] Relative Resources
pg_dump --compress=0 --no-privileges --format=custom source_db | pg_restore --dbname=target_db 702.692 ± 6.316 698.376 709.942 1.05 ± 0.03 User: 18.090 s, System: 28.934 s
pg_dump --compress=1 --no-privileges --format=custom source_db | pg_restore --dbname=target_db 672.341 ± 15.539 655.448 686.025 1.00 User: 89.223 s, System: 21.904 s
pg_dump --compress=5 --no-privileges --format=custom source_db | pg_restore --dbname=target_db 693.527 ± 5.741 688.635 699.848 1.03 ± 0.03 User: 165.459 s, System: 18.331 s
pg_dump --no-privileges --no-owner -w source_db | psql target_db 673.223 ± 8.535 663.433 679.100 1.00 ± 0.03 User: 12.348 s, System: 25.697 s

→ same performance in the best case

Dump / restore separated

As asked by @Bergi and @Nick I measured dumping and restoring separately.

Only dumping

hyperfine --runs=3 -i --export-markdown report.md \
'pg_dump --compress=0 --no-privileges --format=custom source_db > ./1' \
'pg_dump --compress=1 --no-privileges --format=custom source_db > ./2' \
'pg_dump --no-privileges --no-owner -w source_db > ./3'
Command Mean [s] Min [s] Max [s] Relative
pg_dump --compress=0 --no-privileges --format=custom source_db > ./1 77.648 ± 8.720 71.724 87.660 1.09 ± 0.12
pg_dump --compress=1 --no-privileges --format=custom source_db > ./2 104.646 ± 0.911 103.804 105.614 1.47 ± 0.02
pg_dump --no-privileges --no-owner -w source_db > ./3 71.412 ± 1.037 70.321 72.385 1.00

→ psql is 7s s faster

Only restoring

hyperfine --prepare './drop_db.sh target_db ; ./create_db.sh target_db' --runs=3  -i --export-markdown report.md \
'cat ./1 | pg_restore --dbname=target_db' \
'cat ./2 | pg_restore --dbname=target_db' \
'cat ./3 | psql target_db'
Command Mean [s] Min [s] Max [s] Relative
cat ./1 | pg_restore --dbname=target_db 658.715 ± 24.712 637.183 685.697 1.11 ± 0.04
cat ./2 | pg_restore --dbname=target_db 591.001 ± 8.025 582.334 598.172 1.00
cat ./3 | psql target_db 691.756 ± 15.320 680.285 709.154 1.17 ± 0.03

→ restoring is 10x slower than dumping, but is slower when separated from dumping.


Is there anything I can do to make this faster without writing the dump to disk (= using --format=directory which would enable parallelization)?


Solution

  • You answered the question yourself: use

    pg_dump -F c -Z 0 ... | pg_restore ...
    

    That is, don't compress the dump. If plain format dump is slightly faster for you, use that.

    Since the restore is slow for you, tune the target database for that job:

    • increase max_wal_size a lot

    • increase maintenance_work_mem a lot

    • set wal_level = minimal and use --single-transaction with psql or pg_restore, so that PostgreSQL can skip writing WAL