database backup communication failover

Failover strategy for database application


I have a database application that both reads and writes and keeps a local cache. If the application server fails, a backup server should take over.

The primary and backup applications must never run at the same time, because each one keeps its own local cache and the database runs at a low isolation level.

As far as I understand communication between two servers, they cannot always work out between themselves which one is allowed to run exclusively.

Can I somehow resolve this conflict by using the database as a third entity? I think this is quite a typical problem, and there may not be a 100% safe method, but I would be happy to know how other people recommend solving such issues, or whether there is a best practice for this.

It's okay if neither application is working for 30 minutes or so, but that is not enough time to get people out of bed and have them figure out what the problem is.


Solution

  • Can you set up a third server that monitors the health of both application servers? If one of them appears to be gone, this server could then decide what to do and instruct the hot standby to start processing.
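
To make the idea concrete, here is a minimal sketch of what the monitoring process on the third server might look like. Everything in it is an assumption for illustration: the class name `FailoverMonitor`, the three callbacks, and the threshold of three failed checks are hypothetical. It also adds one thing not stated above: it stops (fences) the primary before starting the standby, since the question requires that both applications never run at once.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverMonitor:
    """Hypothetical watchdog that runs on the third server."""

    # All three callbacks are placeholders you would implement yourself.
    check_primary: Callable[[], bool]   # True if the primary answered its health check
    stop_primary: Callable[[], None]    # fence the primary (kill process, revoke DB login, ...)
    start_standby: Callable[[], None]   # instruct the hot standby to start processing
    max_failures: int = 3               # consecutive failed checks before failing over
    failures: int = 0
    failed_over: bool = False

    def tick(self) -> bool:
        """Run one health check; return True if this call triggered the failover."""
        if self.failed_over:
            return False
        if self.check_primary():
            self.failures = 0           # a single good answer resets the counter
            return False
        self.failures += 1
        if self.failures < self.max_failures:
            return False
        # Fence first: if stop_primary() raises, the standby is never started,
        # so the worst case is that neither application runs.
        self.stop_primary()
        self.start_standby()
        self.failed_over = True
        return True

    def run(self, interval_seconds: float = 10.0) -> None:
        """Poll until a failover has happened."""
        while not self.failed_over:
            if self.tick():
                break
            time.sleep(interval_seconds)
```

Requiring several consecutive failures avoids failing over on a single dropped health check. With a polling interval of a few seconds, the outage stays far below the 30 minutes the question allows. The monitor is itself a single point of failure, but if it dies, nothing is started, so exclusivity still holds.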