Search code examples
postgresqlasynchronousdatabase-replicationpostgresql-12

How to monitor async streaming replica delay from the slave?


We have a system with PostgreSQL 12.x where all changes are being written to master database server and two read-only streaming async replicas are used to reduce load from the master server for read-only transactions that can deal with slight delay.

Because the async replica may be delayed from the master in some cases we need a method to query the latency (delay) for the replication. We do not want to contact the master server to do this so one obvious way to do this is to query delay from the replica server:

select
(extract(epoch from now()) - extract(epoch from last_msg_send_time)) * 1000 as delay_ms
from pg_stat_wal_receiver;

However, it seems that pg_stat_wal_receiver has no data for our slave machines. It does have one row but only the pid column has data and every other column is empty. The documentation is unclear about the details but may it be that pg_stat_wal_receiver has data only for sync streaming replica?

Is there a way to figure out streaming delay of async replica? I'm hoping this is just some kind of configuration error instead of "this is not supported".

All the server machines are running PostgreSQL 12.2 but the client machines are still running PostgreSQL 9.5 client library in case it makes a difference.


Solution

  • I think I can answer the question about the missing columns of pg_stat_wal_receiver. To read the rest of columns, you need to login as superuser or a login role being granted the pg_read_all_stats privilege/role.

    This behavior is documented in the source code of walreceiver.c, in the implementation of pg_stat_get_wal_receiver, says:

    ...
    /*
     * Only superusers and members of pg_read_all_stats can see details.
     * Other users only get the pid value to know whether it is a WAL
     * receiver, but no details.
     */
    ...