mongodb, sharding, replicaset, mongodb-replica-set, mongodb-arbiter

Why not use an arbiter in configdb?


I configured a MongoDB sharded environment and a replica set with an arbiter. I know an arbiter can be used in a replica set in the MongoDB cluster zone, but an arbiter cannot be used in the configdb zone. According to the MongoDB site:

The following restrictions apply to a replica set configuration when used for config servers:

Must have zero arbiters.
Must have no delayed members.
Must build indexes (i.e. no member should have buildIndexes setting set to false).
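
In practice this means a config server replica set is initiated with only data-bearing members. A sketch in mongosh (hostnames and ports below are placeholders):

```javascript
// Run in mongosh against one of the config servers.
// All three members are data-bearing voters; adding a member with
// { arbiterOnly: true } (or via rs.addArb()) is rejected for a
// replica set that has configsvr: true.
rs.initiate({
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "cfg1.example.net:27019" },
    { _id: 1, host: "cfg2.example.net:27019" },
    { _id: 2, host: "cfg3.example.net:27019" }
  ]
})
```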

Question: why can't an arbiter be used in configdb? I want to know the detailed reason.


Solution

  • The purpose of an arbiter is to allow a replica set to elect a primary node when a majority of the data-bearing nodes is not available to vote.

    By definition, then, a replica set that needed the arbiter in order to elect a primary is not currently able to process a write that requires acknowledgement from a majority of the replica set.

    Majority writes matter because they ensure that a write will not be rolled back or lost.

    The config database contains the information about which shard holds each chunk of each sharded collection, and which shard holds unsharded collections for a database.

    If the sharded cluster balancer moves a chunk from one shard to another and updates the config db, and that write to the config db is rolled back for any reason, the entire chunk of data becomes unreachable, because every query for it will be routed to the wrong shard.

    To prevent that from occurring, MongoDB uses the write concern level majority for all writes to the config servers.

    So consider a config server replica set with two data-bearing nodes and an arbiter. If one of the data-bearing nodes goes offline for any reason, the replica set still has one data node and the arbiter: two votes, enough to elect a primary. However, when the first write comes in and is processed by the primary, the primary cannot acknowledge it as a majority write until the data has been replicated by a second node, which will not happen as long as the other data-bearing node is down.

    In this situation, the arbiter provides no benefit: a primary exists, but the cluster still cannot perform the majority writes that the config servers require, so it is still not functional.
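
The failure mode described above can be sketched with a toy model (illustrative only, not MongoDB's actual election or replication code): arbiters count toward the voting majority, but only data-bearing members can acknowledge a write.

```javascript
// Toy model of a replica set member: "up" = reachable, "arbiter" = holds no data.

function majorityCount(members) {
  // A strict majority of all voting members (arbiters vote too).
  return Math.floor(members.length / 2) + 1;
}

function canElectPrimary(members) {
  // Election succeeds when a majority of voting members is reachable.
  return members.filter(m => m.up).length >= majorityCount(members);
}

function canAckMajorityWrite(members) {
  // w:"majority" needs the write applied on a majority of voting members,
  // but arbiters hold no data, so only up data-bearing members can count.
  const acks = members.filter(m => m.up && !m.arbiter).length;
  return acks >= majorityCount(members);
}

// The scenario from the answer: two data-bearing nodes plus an arbiter,
// with one data-bearing node offline.
const psa = [
  { up: true,  arbiter: false },  // surviving data node (becomes primary)
  { up: false, arbiter: false },  // data node that is down
  { up: true,  arbiter: true  },  // arbiter
];

console.log(canElectPrimary(psa));      // true  -> a primary can be elected
console.log(canAckMajorityWrite(psa));  // false -> majority writes stall
```

With two of three votes up, a primary is elected, yet only one of the two required acknowledgements can ever arrive, which is exactly why the config servers, where every write uses w:"majority", disallow arbiters.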