Search code examples
postgresqlpg-restore

pg_restore -F d -j {n} internal steps and estimation of time


I'm running a looooong pg_restore process of a database with 70 tables and 800Gb. The process is taking 5 days now. I'm monitoring some aspects of the process to evaluate how long will it take but I've some things missing and this is why I'm asking.

I run pg_dump with parameters -F d -j 10 the dump took about 12 hours. I noticed each one of the 10 threads took responsibility of a single table from start to end. After ending of processing a single table, the same process (pid) started with another table not taken by another process.

Running pg_restore is taking much longer (5 days and still working). The main reason is that I'm restoring to a NAS external drive mounted using nfs and that drive is very slow compared to a local hard drive. This is NOT a problem, I'll migrate the information back from the NAS to the original hard drive once I format the hard drive again and install the new operating system.

I'm doing two things to monitor progress:

  • In a separate terminal I launch du -sh /var/lib/pgsql and evaluate the disk space consumed in the new installation. It has to reach, more or less, the same space the original database was using.
  • In a separate terminal I launch ps -fu postgress and I see several pg_restore processes running. Each one of then linked with another process with this shape postgress: postress {dbname} [local] {command} where {dbname} is the database name, and {command} varies. Initially, there was the COPY command I think that was used to restore the table content. I also saw some CREATE INDEX commands for re-creating the indexes of that table, and now I see ALTER TABLE commands, don't know exactly for what.

At this time, all processes are just doing ALTER TABLE and the overall used space almost matches the initial space, but the process does not ends (and it is taking 5 days now).

So I'm asking if someone has more experience and can tell me what pg_restore is doing with the ALTER_TABLE command and if there is any other mechanism to estimate how long will it take.

Thanks! Ignacio


Solution

  • The ALTER TABLE statements at the end of a pg_restore create primary and unique keys as well as foreign key constraints. They could also be attaching partitions, but that is normally very fast.

    Look into pg_stat_progress_create_index if you have a recent enough PostgreSQL version (you didn't say), then you can monitor the progress of primary and unique key indexes being created.