Search code examples
sqlpostgresqlpg-dumpreindexpg-restore

In Postgres, after truncating or deleting rows, does executing reindex before pg_dump optimize pg_restore?


Do the REINDEX statements below help with the restore operation or file size of the dump?

Could not find a question about this anywhere on SO or web. I'm using Postgres 9.4 and cleaning out a very large database with both truncate and delete statements on various tables.

The table data varies in type and size.

After this clean up operation, I immediately execute a pg_dump, tar and upload, then pg_restore. It's using the directory format with 12 jobs in parallel for dump, 8 for restore.

For example, these queries first:

TRUNCATE users;
DELETE FROM users_email WHERE active = 1;

REINDEX TABLE users;
REINDEX TABLE users_email;

Then:

$ pg_dump_9.4 --compress=0 -F directory -j 12 $DB_EXPORT_NAME -f $DB_DUMP_FOLDER 2>> operations.log

$ # do tar and upload with dump then:

$ pg_restore_9.4 -d $DB_IMPORT_NAME -j 8 $DB_DUMP_FOLDER 2>> operations.log

Solution

  • This will make no difference at all for pg_dump or pg_restore.

    pg_dump doesn't use the index at all, it just writes its definition as a CREATE INDEX statement into the dump. The table itself is scanned sequentially.

    pg_restore creates the index using the CREATE INDEX from the dump.