Search code examples
postgresqlpostgresql-13

PostgreSQL - Does a single archive file contain information for only a specific database on a cluster or is it the entire cluster?


Note: this question is with regards to PostgreSQL version 13.
On my Ubuntu server, I have a cluster called main which has 2 databases inside it (the first one being for a fruits company and the second one for a car company).

Here are my postgresql.conf file settings:

wal_level = replica
archive_mode = on
archive_command = 'pxz --compress --keep --force -6 --to-stdout --quiet %p > /datadrive/postgresql/13/wal_archives/%f.xz'

This creates .xz files in /datadrive/postgresql/13/wal_archives/ as expected. For example: a file name may look like this:

0000000100000460000000A4.xz

Now my question's regarding this archiving process are as follows:

  1. Is this particular .xz file an archive of all the databases in the postgresql cluster? i.e. does this particular xz file contain an archive for both the fruits and the car databases or does it only contain an archive for only one of them?
  2. What is an archive file? Is it just a single WAL file or is it an archive point + a WAL file?

I have read the official documentation found here and here and also looked at a large number of stackoverflow and database stack exchange questions and have not managed to gain a good understanding of the archive concept.


Solution

  • Such a file is called a "WAL segment". WAL is short for "write ahead log" and is the transaction log, which contains the information required to replay data modifications for the whole database cluster. So it contains data for all databases in the cluster.

    WAL is an endless append-only stream, which is split into segments of a fixed size. A WAL archive is nothing more than a faithful copy of a WAL segment.

    WAL archives are used together with a base backup to perform point-in-time-recovery. Other uses for WAL files are crash recovery and replication, but these don't require archived WAL segments.