postgresql kubernetes replicaset logical-replication pg-cron

Kubernetes StatefulSet Database Replicas


I am wondering what happens when we run a database like PostgreSQL in a Kubernetes StatefulSet configured with replicas > 1. My question may be obvious or just completely crazy.

The obvious reasons to have multiple database replicas are load balancing (scaling out) and avoiding a single point of failure.

But what happens with other things, like logical replication or pg_cron? If the database has subscriptions (to other databases acting as publishers), will the replication process run more than once? On the other hand, if the database has functions or procedures that pg_cron schedules to run every hour, will they run once for each replica?

And this would happen if two or more replicas point to the same PV, so to avoid it we should use volume claim templates. But then the replicas would hold different data.

And is this the point where we should set up master-slave replication between the database replicas?

Is the above correct?


Solution

  • Running a database in a StatefulSet with replicas > 1 is, at a high level, no different from running 2 instances of a database on 2 different servers.

    You don't get any magic behaviour out of the box: if you need an HA setup, you need to take care of replication, failover and everything around that.

    If you want to go down this route, there are projects that aim to run HA Postgres on Kubernetes, for example https://postgres-operator.readthedocs.io/en/latest/
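
To make the "2 instances on 2 different servers" point concrete, here is a minimal Python sketch of the naming scheme a StatefulSet applies when it uses a volume claim template: each pod is named `<statefulset>-<ordinal>` and gets its own claim named `<template>-<statefulset>-<ordinal>`. The function name and the names `postgres` and `pgdata` are hypothetical and chosen only for illustration. This only models the naming, it does not talk to a cluster.

```python
def statefulset_identities(name: str, replicas: int, claim_template: str) -> list[dict]:
    """Model the pod and PVC names a StatefulSet assigns to each replica.

    `name` is the StatefulSet name and `claim_template` is the name of one
    entry in volumeClaimTemplates (both hypothetical values here).
    """
    return [
        {
            # Pods get a stable, ordinal-based name.
            "pod": f"{name}-{ordinal}",
            # Each pod gets its own claim, so no two replicas share storage.
            "pvc": f"{claim_template}-{name}-{ordinal}",
        }
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    for identity in statefulset_identities("postgres", 2, "pgdata"):
        print(identity["pod"], "->", identity["pvc"])
```

With two replicas this prints `postgres-0 -> pgdata-postgres-0` and `postgres-1 -> pgdata-postgres-1`: two separate volumes, hence two independent databases unless replication is configured between them.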