what decides the roles in Paxos?

When people describe Paxos, they always assume that there are already some proposers in the cluster. But where are the proposers from, or what decides which processes to be proposers?

Solution

How the cluster is initially configured and how it is changed is down to the administrator who is trying to optimise the system.

You can run the different roles on different hosts and have different numbers of them. We could run three proposers, five acceptor and seven learners, whatever you choose. Clients that need to write a value only need to connect to proposers. With multi-Paxos for state replication clients only need to connect to proposers as that is sufficient and the clients don't need to exchange messages with any other role type. Yet there is nothing to prevent clients from also being learners by seeing messages from acceptor.

As long as you follow the Paxos algorithm it all comes down to minimising network hops (latency and bandwidth), costs of hardware, and complexity of the software for your particular workload.

From a practical perspective your clients need to be able to find proposers in the face of failures. A cluster administrator will be configuring which nodes are to be proposes and making sure that they are discovered by clients.

It is hard to visualize from descriptions of the abstract algorithm how things might work as many messaging topographies are possible. When applying the algorithm to a practical application its fair more obvious what setup minimises latency, bandwidth, hardware and complexity. An example might be a three node MySQL cluster running Paxos. You want all three servers to have all the data so they are all learners. All must be acceptors as you need three at a minimum to have one node fail and still maintain progress. They may as well all be proposers to give the best availability and simplicity of software and configuration. Note that one will become the distinguished leader. The database administrator doesn't think about the Paxos roles as they just set up a three-node database cluster.

The roles in the cluster may need to change. For example, you might want to expand the capacity of a database cluster. Or a server might die so you need to change the cluster membership to swap the dead one for a fresh one. For the Paxos algorithm to work every process must have a strongly consistent view of which processes are in which roles. How do you get consensus? You use Paxos to fix a new value of the cluster membership.