Tags: django, database, postgresql, database-backups, dumpdata

Django's "dumpdata" or Postgres' "pg_dump"?


I'm unsure as to whether this question should be posted in the Database Administrators' section or here, so please advise if I got it wrong.

I have a Django-based website which doesn't change much. I use python manage.py dumpdata --all --indent=2 > backup.json and reload the data with loaddata if I need to redeploy or the db gets corrupted. (I'm aware of the integrity errors that can occur when auth and contenttypes aren't excluded.)
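For reference, a common way to avoid those integrity errors is to exclude the contenttypes and auth.permission models and use natural keys, roughly like this (the file name is just an example):

```shell
# Dump everything except the auto-generated contenttypes and permissions,
# using natural keys so the fixture doesn't pin their primary keys.
python manage.py dumpdata --natural-foreign --natural-primary \
    --exclude contenttypes --exclude auth.permission \
    --indent 2 > backup.json

# Reload into a freshly migrated database.
python manage.py loaddata backup.json
```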

Since I'm using PostgreSQL on the backend, is it "best practice" or "wiser" for me to use pg_dump instead, and then pg_restore if something goes wrong or if I need to redeploy?

So dumpdata dumps all data associated with the selected apps (and/or models), while pg_dump performs a full dump of the db. Is this the same thing, or is there a fundamental difference I've missed (mind you, I have zero experience with database administration)?

Which option do I go for and why?


Solution

  • It is both best practice and wiser for you to use pg_dump instead of dumpdata.

    There are many reasons for this.

    • pg_dump is faster, and its output is more compact (particularly with the -Fc custom format option) than dumpdata's JSON.

    • Importing the data back into the db with pg_restore will also be faster than Django's loaddata.

    • pg_restore ships with every PostgreSQL installation, whereas Django and its dependencies have to be installed before loaddata can run.

    • Last but not least, the integrity errors you spoke of will not happen with pg_dump/pg_restore.

    Generally pg_dump is used to dump the entire database; however, the -t option allows you to dump one or a few tables at a time.
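    A minimal sketch of the workflow described above, assuming a database named mysite and a table named myapp_article (adjust names, hosts, and credentials for your setup):

    ```shell
    # Full dump in the compact custom format (-Fc), restorable with pg_restore.
    pg_dump -Fc mysite > mysite.dump

    # Dump only a single table with -t.
    pg_dump -Fc -t myapp_article mysite > article.dump

    # Restore into the database, dropping existing objects first.
    pg_restore --clean --if-exists -d mysite mysite.dump
    ```

    The custom format also lets pg_restore restore selectively (e.g. with its own -t flag) and in parallel with -j, which plain SQL dumps cannot do.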