Search code examples
paxos

How does one handled skipped event numbers with Paxos?


If we are running multi-paxos then a node may see:

Propose(N)
Accept!(N,Vn)
Accept!(N+1,Vm)
Accept!(N+4,Vo) // huh? where is +2, +3?
Accept!(N+5,Vp)

This may be because either:

  • There was a stable leader but the network local to this node dropped else delayed +2 and +3.
  • There was an outage such that there were two attempts to propose such that +2 and +3 were failed rounds proposals

In general operations on the distributed finite state machine wont commute such that a node should apply all operations in order. This implies that a node needs to be able to distinguish between the two cases. If it is failed rounds of proposals the node has no problems. If it is lost messages it suggests that the node should wait till they turn up else try to recover the lost data (e.g. request a snapshot to reinitialise and catchup).

What are the options or strategies to handle this and what overhead do they create?

This question is inspired by In Paxos, can an Acceptor accept a different value after it has already accepted one?


Solution

  • I can think of two methods to deal with this.

    The simplest approach would be to have the node that is missing +2 and +3 to go back and try to propose no-ops in those slots. If there were decisions there, the node will learn the data in the prepare round. Otherwise, no-ops will be decided.

    Another approach would be to have an out-of-band re-learning process. This may be necessary anyway: how does a node catch up if it joins the system after the others?

    Or you can use a combination of both. The leader can propose no-ops for any holes in its history, the others can use the re-learning process. This is how my paxos system works.