distributed-computing distributed distributed-system consensus paxos

Paxos understanding

I have read the paper Paxos made simple.

And after hard thinking, I come to this conclusion:

The Paxos protocol always guarantees the majority of servers accept the same value at the same turn of the proposal and so that finally, we can derive the consensus from the majority of servers.

For example, table 1 shows that there are three servers(S1, S2, S3), and the N1, N2... denotes the proposal, there are many gaps in the proposals but for every proposal, we can derive the consensus by the majority servers.

N1: S1(A)    & S2(A)    & S3(A)    => A

N2: S1(B)    & S2(null) & S3(B)    => B

N3: S1(null) & S2(C)    & S3(C)    => C

N4: S1(D)    & S2(D)    & S3(null) => D

So we get consensus as table 2 shown.

Is my understanding right?

Table 1:

	N1	N2	N3	N4
S1	A	B		D
S2	A		C	D
S3	A	B	C

Table 2:

N1	N2	N3	N4
A	B	C	D

Solution

From what you have said I think you are seeing whether or not Consensus on a sequence of values is obtained from looking at the values accepted in increasing proposals. Follow this, the log is obtained from the proposal numbers ((N1, A), (N2, B), (N3, C), (N4, B)). I think this is the case because you refer to gaps in the sequence of proposals.

If this is the case, then I think there might be some confusion.

In Paxos, there are two things to worry about:

Proposals (or ballots, attempts to choose a value)
Log values (values agreed on by the acceptors and then executed by a state machine)

For the log, there must be fault-tolerant agreement on each value. A value can be executed only if all previous values have also been executed, therefore there must be no gaps in the log if you wish to execute all values. This is a liveness issue though and has nothing to do with safety. You can safely learn of all the values in the log and see them as "committed". If it were a machine executing commands issued by clients, you could tell a client their request for a command will occur but you couldn't tell them the result of executing their command as other commands that you do not know of will be executed before it.

The log uses the basic Paxos protocol to provide the fault-tolerant agreement on a single value. This is where the idea of committing a value is useful. The basic Paxos protocol outlined in Paxos Made Simple allows for agreement on only a single value and it uses a totally ordered sequence of proposals (ballots)to agree on that single value.

The main thing here is that there are can be multiple proposals in deciding on one value to be at a single index in the log.

Following that, there might be gaps in the sequence of proposals promised or accepted by acceptors in Paxos but this is okay and part of the algorithm's operation. The Paxos algorithm ensures that if a value is agreed upon in one proposal, all higher-numbered proposals will see the same value proposed and therefore agreed upon. Please note this is different from the example in Tables 1 and 2 of the question. In that example, if a majority of servers (acceptors) had accepted A in proposal N1, then all later proposals would also accept value N1.

This ability to agree on multiple proposals but with the same is necessary for fault tolerance, in particular for liveness. Imagine if a proposal Nl was accepted by a majority of acceptors, and then all but one acceptor crashed before any notification of acceptances of Nl could be made. A value has been agreed upon but no one knows that. Now, to recover from this, another proposer should issue a higher-numbered proposal. Let's call this proposal Nh.

Making a proposal has two phases: Promise and Accept. In the first phase, the proposer must obtain promises from a majority (or quorum) of acceptors on the proposal Nh. The purpose of this is to determine whether or not a value was (possibly) agreed upon in a lower-numbered proposal. In this case, we will learn that one acceptor had accepted a value in a lower-numbered proposal, Nl. (The other acceptors who responded will report they've accepted nothing.) Despite only one acceptor reporting they've accepted a value in Nl, the proposer must (for agreement-safety reasons) assume it was possibly chosen. In the second phase, then, the proposer will issue acceptance requests to the acceptors for the proposal Nh, and will include the value of the earlier proposal Nl. If at least a majority (quorum) of acceptors accept this, then the proposal is chosen and all things being well, learners will learn that value to the value committed for a single index in the log. Otherwise, the process of issuing proposals for that to decide on the value at the log index will continue again.

Finally, the last point to raise is there is also no requirement that a proposal has a majority of acceptance and this is important because there might not be a majority of acceptors non-faulty at the time a proposal is issued. To motivate this consider if a proper, or too many acceptors fail during a proposal.

Imagine if in an instance of the Paxos algorithm there are there acceptors, and one of them accepts a value in proposal Na. Then before Na can be accepted by a majority, the proposer of Na fails and any messages in transit over the network are lost. In this case, proposers need to be able to make new proposals and see a value chosen. In the scenario, another proposer will begin a higher numbered proposal Nb. It will go through the promise phase as before and determine no value could possibly be chosen after talking to the two acceptors who didn't accept anything. Now the proposer of Nb can propose any value and see it accepted by a majority and agreed upon. Proposing any value here is safe because the proposer of the Na, should it come back online, will not have their value accepted by a majority because a majority of acceptors had promised and accepted on a higher-numbered proposal.

Michael :)