Tags: amazon-web-services, amazon-rds, read-replication

How to re-sync AWS RDS read replica


Is there a way to fix a read replica that has stopped syncing with the master database? I am already in the process of deleting it and creating a new one, since I could not find an answer, but it would be nice to know in case it happens again.

The database(s) in question are MySQL, with InnoDB tables.


Solution

  • The principle behind MySQL replication is simple: if you start with two identical data sets, and every time you change one, you change the other, then the two data sets will remain identical. That's how MySQL replication works -- you start with two identical servers, either completely blank or with one being an exact snapshot of the other, and replication just executes the same thing on both servers.

    Replication is done via the binary log ("binlog"), which captures all changes to the master. In standard MySQL asynchronous replication -- as used in RDS -- the replica has two purpose-specific threads: the I/O thread, which connects to the master, captures the replication events from the master's binlog, and writes them to a temporary holding area called the relay log; and the SQL thread, which reads from the relay log and applies the changes to the replica.

    On the replica, the query SHOW SLAVE STATUS; will tell you whether these two threads are running. If they are, the replica is healthy, though it might be behind the master, as indicated by the Seconds_Behind_Master value you'll also find in that query's output. If not, you'll find the error that caused one or the other thread to stop.
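    As a quick health check, this is roughly what to look for in that output (the field names are the standard ones MySQL reports; the sketch assumes the mysql client's \G terminator for vertical output):

    ```sql
    -- On the replica: check the state of the replication threads.
    SHOW SLAVE STATUS\G

    -- Key fields to inspect in the output:
    --   Slave_IO_Running:      Yes/No -- the I/O thread (master binlog -> relay log)
    --   Slave_SQL_Running:     Yes/No -- the SQL thread (relay log -> replica data)
    --   Seconds_Behind_Master: replication lag; NULL when a thread is stopped
    --   Last_IO_Error,
    --   Last_SQL_Error:        why a stopped thread stopped
    ```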

    In theory, a MySQL replica will never go out of sync unless one of three things happens:

    • you do something you shouldn't that makes the replica inconsistent with the master, such as making the replica writable and writing to it;
    • there's a bug in the MySQL source code that causes inconsistency; or
    • the replica is disconnected from the master for long enough that the master has already discarded replication events the replica has never seen.

    The first issue will cause the SQL thread to halt because it tries to apply a nonsense change -- typically deleting a row that doesn't exist, updating a row that doesn't exist or doesn't match, inserting a row that's already present, etc. (MySQL errors 1032 and 1062 are the classic symptoms).
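    As an illustration of the first case -- the table and values here are hypothetical, and this particular failure mode assumes row-based replication:

    ```sql
    -- Hypothetical: with the replica made writable, someone deletes a row
    -- directly on the replica.
    DELETE FROM orders WHERE id = 42;

    -- Later, the master deletes or updates that same row. When the event
    -- reaches the replica, the row is already gone, and under row-based
    -- replication the SQL thread halts with error 1032 ("Can't find record").
    ```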

    The second issue could cause a problem with either the I/O thread or the SQL thread, but such bugs should be rare.

    The last issue will cause the I/O thread to halt, because it remembers where it left off on the master, and if the binary log file containing that position is no longer available on the master, it is at an impasse. RDS is supposed to prevent this by holding logs on the master until all managed replicas have captured them.
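    On RDS you can also inspect and extend how long the master retains its binlogs, using the documented RDS stored procedures (the 24-hour value below is just an example):

    ```sql
    -- On the master: show current RDS-level settings, including
    -- 'binlog retention hours'.
    CALL mysql.rds_show_configuration;

    -- Keep binlogs around for 24 hours so a briefly disconnected replica
    -- can still catch up.
    CALL mysql.rds_set_configuration('binlog retention hours', 24);
    ```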

    So, the general answer is that you can fix a MySQL read replica by bringing all of its data into exactly the state it should be in, based on the state of the master as of the position in the relay logs at which the replication SQL thread is currently pointing.
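    A sketch of how you would locate that position -- the file name and offset below are hypothetical:

    ```sql
    -- On the replica: find where the SQL thread stopped, expressed in the
    -- master's binlog coordinates.
    SHOW SLAVE STATUS\G
    --   Relay_Master_Log_File: mysql-bin-changelog.012345   (hypothetical)
    --   Exec_Master_Log_Pos:   4096                         (hypothetical)

    -- On the master: inspect the events at and after that position to see
    -- exactly what the replica choked on.
    SHOW BINLOG EVENTS IN 'mysql-bin-changelog.012345' FROM 4096 LIMIT 10;
    ```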

    That's a little bit trickier in RDS, because you don't have the SUPER privilege, but it's still possible.

    tl;dr: broken replication is only a symptom -- you have to figure out what the actual problem is.

    You need to be able to identify what's gone wrong, and take steps to correct it. The problem is, when replication stops, unless you have a very clear understanding of exactly what happened, you don't actually know just how bad things might be on the replica.

    Thinking back to the principle mentioned above -- start with two identical data sets, and every time you change one, change the other -- the next thing to note is that MySQL does not have any built-in mechanism for ensuring consistency in the absence of actual replication errors. Two servers can be dramatically divergent, but replication will happily continue until the SQL thread encounters something that it cannot replicate. You need a third-party utility that can compare the data on the two servers and call out any discrepancies; pt-table-checksum from the Percona Toolkit is the standard example.

    If you clearly understand what went wrong, you can temporarily make the replica writable (using the parameter group setting for the read_only system variable), make the corrective changes, and restart replication.

    On RDS, since you don't have the SUPER privilege, you can only restart replication at the current event pointer by rebooting the replica; alternatively, you can bring the replica to the state it should have been in after the problematic event replicated, and then use the workaround RDS provides for skipping that event: CALL mysql.rds_skip_repl_error();.

    Do not use this procedure without understanding what it does. Specifically, it ignores the failure and moves on to the next event, which absolutely leaves your replica in an inconsistent state unless you have manually brought it back into consistency first. It should be reserved for emergencies only, when keeping the replica current is more important than keeping the replica correct, because skipping an error essentially guarantees more errors in the future.
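    A minimal sketch of that emergency workaround, assuming you have already repaired the affected rows by hand (read_only itself is changed through the replica's DB parameter group, not through SQL):

    ```sql
    -- On the replica: confirm which event failed and why.
    SHOW SLAVE STATUS\G   -- note Last_SQL_Errno and Last_SQL_Error

    -- RDS-provided workaround: skip the single failed event and restart
    -- the SQL thread. Only safe if the replica's data has already been
    -- brought back into a consistent state by hand.
    CALL mysql.rds_skip_repl_error();

    -- Verify that replication has resumed.
    SHOW SLAVE STATUS\G   -- expect Slave_SQL_Running: Yes
    ```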

    Repairing a replica is not a trivial exercise; it's a task for an experienced DBA. In RDS, the best option is typically to discard the replica and create a new one, but since replication errors should never happen in the first place, this is not something you should ever have to do. If you do, you need to discover why.