I'm currently pushing local data up to an AWS RDS database, but because of the data size this is taking a long time. I was hoping someone might be able to suggest a way to get the dump file compressed before sending it up (and then getting it read at the other end).
Currently the dump file is being written thus:
pg_dump -t properties_staging --data-only project_development > data/properties_staging.dump
And pushed up to AWS this way:
psql -a -h blahblahblah.blahblahblah.eu-west-1.rds.amazonaws.com -U master -d ebdb -p 5432 < data/properties_staging.dump
What are the best options available for slimming this dump file down while still punting it up to RDS?
All suggestions very gratefully received.
Thanks
pg_dump supports several options that may be faster and/or result in smaller files. Here are some options to try, from the documentation. Note that this is very dependent on your data, and some of these choices will require you to use pg_restore instead of psql.
--format=custom: use a binary format, compressed by default.
--format=directory: create multiple files in the dump, compressed by default. Allows parallel dumps.
--jobs=N: use N threads for the dump and compression. May speed up the operation.
--compress=N: a number from 0 to 9. Zero is no compression, 1 is low-CPU compression, 9 is high-CPU compression; in short, higher values spend more CPU time to produce a smaller file.

Again, what will be the fastest? I don't know. Test and benchmark for your specific situation. It will likely be faster if you copy the output of pg_dump to an EC2 instance (with a fairly fast CPU, not a micro or t2) in the same AZ as the RDS instance and run pg_restore or psql from there; there's a sketch of that below.
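For example, a dump/restore along these lines would give you a compressed file to ship up. This is just a sketch, not tested against your setup; the EC2 hostname, user and paths are placeholders you'd swap for your own:

# dump only that table, data only, in the compressed custom format
pg_dump -t properties_staging --data-only --format=custom --compress=9 -f data/properties_staging.dump project_development

# copy the (much smaller) file to an EC2 box in the same AZ as the RDS instance
scp data/properties_staging.dump ec2-user@my-ec2-host:/tmp/

# then restore into RDS from that box, using pg_restore rather than psql
pg_restore -h blahblahblah.blahblahblah.eu-west-1.rds.amazonaws.com -U master -d ebdb -p 5432 --data-only /tmp/properties_staging.dump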
Note pg_restore has a --jobs=N flag too. This setting doesn't need to match the one in pg_dump.
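If you go the directory-format route, a parallel dump and restore might look something like this (again just a sketch; tune the job counts to the CPUs you have, and the directory name is made up):

# dump with the directory format so both the dump and the restore can run in parallel
pg_dump -t properties_staging --data-only --format=directory --jobs=4 -f data/properties_staging_dir project_development

# restore with multiple jobs; the count doesn't have to match the dump side
pg_restore -h blahblahblah.blahblahblah.eu-west-1.rds.amazonaws.com -U master -d ebdb -p 5432 --data-only --jobs=2 data/properties_staging_dir

Bear in mind that pg_dump and pg_restore parallelise across tables, so with a single table in the dump the extra jobs won't buy much; it pays off once you're moving several tables at once.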