distributed-computing, consensus, paxos, raft

How do replicas coming back online in PAXOS or RAFT catch up?


In consensus algorithms such as Paxos and Raft, a value is proposed, and if a quorum agrees, it's written durably to the data store. What happens to the participants that were unavailable when the quorum was reached? How do they eventually catch up? This seems to be left as an exercise for the reader wherever I look.


Solution

  • Take a look at the Raft protocol: catching up is built into the algorithm. For each follower, the leader tracks matchIndex (the highest log index known to be replicated on that follower) and nextIndex (the index of the next entry to send to it), and it always sends entries starting at that follower's nextIndex. As a result, no special case is needed to catch up a follower that was unavailable when an entry was committed. When the node restarts, the leader simply resumes sending from that follower's nextIndex, backing nextIndex up whenever the follower rejects an AppendEntries request because its log does not match, until it reaches the point where the two logs agree. From there the follower receives every entry it missed, and the node is caught up.
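
To make the mechanism concrete, here is a minimal single-process sketch of the catch-up loop, not a real Raft implementation. The class and method names (`Entry`, `Follower`, `Leader`, `replicate`) are made up for illustration, there is no networking, persistence, or election logic, and the follower's log handling is simplified (it replaces everything after the matching point instead of truncating only on a conflict). Log indices are 1-based, as in the Raft paper.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int
    command: str


@dataclass
class Follower:
    log: list = field(default_factory=list)

    def append_entries(self, prev_index, prev_term, entries):
        """Accept entries only if our log holds the entry just before them."""
        if prev_index > 0:
            if len(self.log) < prev_index:
                return False  # we are missing earlier entries
            if self.log[prev_index - 1].term != prev_term:
                return False  # our log diverges at prev_index
        # Simplification (assumption): overwrite everything after prev_index.
        self.log = self.log[:prev_index] + list(entries)
        return True


@dataclass
class Leader:
    log: list
    next_index: dict = field(default_factory=dict)
    match_index: dict = field(default_factory=dict)

    def add_follower(self, name):
        # Start optimistically, just past the leader's last entry.
        self.next_index[name] = len(self.log) + 1
        self.match_index[name] = 0

    def replicate(self, name, follower):
        """Run one AppendEntries round; return True if the follower accepted."""
        ni = self.next_index[name]
        prev_index = ni - 1
        prev_term = self.log[prev_index - 1].term if prev_index > 0 else 0
        entries = self.log[prev_index:]  # everything from next_index onward
        if follower.append_entries(prev_index, prev_term, entries):
            self.match_index[name] = len(self.log)
            self.next_index[name] = len(self.log) + 1
            return True
        # Rejected: back up one entry and let the caller retry.
        self.next_index[name] = ni - 1
        return False


# A follower that was down while entries 3 and 4 were committed.
leader = Leader(log=[Entry(1, "a"), Entry(1, "b"), Entry(2, "c"), Entry(2, "d")])
stale = Follower(log=[Entry(1, "a"), Entry(1, "b")])
leader.add_follower("f1")

rounds = 1
while not leader.replicate("f1", stale):
    rounds += 1
```

The same `replicate` call serves a healthy follower and a lagging one. The only difference is how many rejected rounds it takes before `next_index` lands on the point where the logs agree, which is why no separate catch-up path is needed. Production implementations typically speed up the back-off and fall back to sending a snapshot when the leader has already compacted the entries a follower needs.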