Search code examples
postgresqlazurewal

PostgreSQL / WAL-archiving: can I leave archive_command empty when doing image snapshot backups?


I have a PostgreSQL 9.5 instance running off an Azure VM. As described here, I must specify a post- and a prescript to tell Azure: "Yes, I've taken care of putting the VM in a state, so the entire VM/blob can be backed up as a snapshot that can be restored as a working new VM" and "Now I'm done", thus Azure will flag the backup as Application consistent.

In terms of PostgreSQL, I have read the docs on continuous archiving, that instruct why and how to enable WAL Archiving to allow for backups. And here comes my question:

If I set archive_mode = on and wal_level = archive, can I leave the archive_command empty, and does this even make sense? Or - should I do some kind of archiving here (like e.g. copying the log segments to another location / disk), and is this archiving necessary to ensure a working database upon restoring the VM in my scenario?

I only need to tell the PostgreSQL "Wait a minute / hold your data-writes (or whatever goes on), while I create a snapshot of the entire VM". The plan is to execute pg_start_backup() before , take the snapshot and then pg_stop_backup().

I do realize, this method (if it's even valid) is essentially a file system level backup, and according to docs, the postgres-service must be shut down for the fs-backup to be valid. Another place I've read that hitting the pg_start_backup() should be enough to guarantee for a valid stand-alone physical backup.


Solution

  • If the snapshots you plan to take are truly atomic, that is, the restored snapshot represents the state of the file system at some point in time, you can just restart the database from such a snapshot, and it will perform crash recovery and come up in a consistent state.

    In that case, there is no need to care about WAL archiving or backup mode. You could set archive_mod = off and not worry about it.

    If the snapshot is not truly atomic, or you want point-in-time-recovery (the ability to restore the database to a point in time between backups), you need WAL archiving set up and running, because you need the WALs to restore the database to a consistent state.

    In that case archive_mode must be on and archive_command must be a command that returns success only if the WAL file has been archived successfully. If only one WAL is missing between your last backup and the time to which you want to restore the database, it will not work.