Getting error `replication slot "pgl_testdb_pgnode_pdaaa79d_sub1" does not exist`

I have a 3-node cluster with logical replication enabled. The subscriber connects to a virtual IP that points to the current leader (master) of the cluster, and data is being replicated to the subscriber.

Whenever the master node goes down and one of the replicas is promoted to master, logical replication stops with the following error:

2021-04-13T09:32:12.912262+00:00 host2 postgres_2[13527]: [7-1] pid=13527,session=6075651c.34d7,line=1,sqlstate=42704,user_app=sub1,user=bpuser,db=testdb,client=10.186.34.182,txId=0 ERROR: replication slot "pgl_testdb_pgnode_pdaaa79d_sub1" does not exist

This error occurs on the new master. I am using PostgreSQL 12.

Solution

Replication slots exist only on the primary server and are not copied to its standbys, so they are lost when the primary server goes down (and doesn't come up again) and a standby is promoted in its place.
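To confirm that this is what happened, you can pull the slot name out of the error line and then list the slots that actually exist on the new primary. The following is a minimal sketch: the helper name `missing_slot` is made up for illustration, the log line is abbreviated from the one in the question, and the query only uses the standard `pg_replication_slots` system view.

```python
import re

# Matches the slot name in PostgreSQL's
# 'replication slot "..." does not exist' error message.
SLOT_ERROR = re.compile(r'ERROR:\s+replication slot "([^"]+)" does not exist')

# Query to run on the new primary (e.g. via psql) to see which slots exist.
LIST_SLOTS_SQL = (
    "SELECT slot_name, plugin, slot_type, active FROM pg_replication_slots;"
)


def missing_slot(log_line):
    """Return the slot name from a 'does not exist' error line, or None."""
    match = SLOT_ERROR.search(log_line)
    return match.group(1) if match else None


# Abbreviated version of the log line from the question.
line = ('sqlstate=42704,user_app=sub1,user=bpuser,db=testdb ERROR: '
        'replication slot "pgl_testdb_pgnode_pdaaa79d_sub1" does not exist')

print(missing_slot(line))
print(LIST_SLOTS_SQL)
```

If the slot named in the error is absent from the query result on the promoted node, the subscriber is asking for a slot that was never carried over from the old primary.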

The only safe way I can think of to recover is to build the logical replication standbys from scratch after a failover. I don't think that logical replication can be used for a good high-availability solution.
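As a rough illustration of what "from scratch" could look like, here is a sketch that prints the statements to run on the subscriber. It assumes PostgreSQL's built-in logical replication (`CREATE SUBSCRIPTION`); the slot name in the question looks like it may come from the pglogical extension, which manages subscriptions through its own functions, so treat this as an outline of the idea rather than a recipe for that setup. The function name `rebuild_statements` and the publication, table, and connection values are hypothetical placeholders.

```python
def rebuild_statements(subscription, publication, conninfo, tables):
    """Return SQL to run on the subscriber, in order, to re-create a subscription.

    Assumes built-in logical replication; identifiers are inserted as given,
    so only pass trusted, plain names.
    """
    stmts = [
        # Stop the apply worker and detach the subscription from the slot that
        # no longer exists, so DROP SUBSCRIPTION does not try to drop it remotely.
        f"ALTER SUBSCRIPTION {subscription} DISABLE;",
        f"ALTER SUBSCRIPTION {subscription} SET (slot_name = NONE);",
        f"DROP SUBSCRIPTION {subscription};",
    ]
    # Empty the replicated tables so the initial copy does not collide
    # with rows left over from before the failover.
    stmts += [f"TRUNCATE {table};" for table in tables]
    # With default options this creates a new slot on the publisher
    # and copies the existing table data.
    stmts.append(
        f"CREATE SUBSCRIPTION {subscription} CONNECTION '{conninfo}' "
        f"PUBLICATION {publication};"
    )
    return stmts


# Hypothetical names: "pub1", "my_table" and the connection string are placeholders.
for stmt in rebuild_statements(
    "sub1", "pub1", "host=virtual-ip dbname=testdb user=bpuser", ["my_table"]
):
    print(stmt)
```

The full re-copy is the price of safety here: changes made between the last confirmed position of the old slot and the failover cannot be replayed from a newly created slot, so anything short of a fresh copy risks silently missing rows.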