Tags: mongodb, fault-tolerance, replicaset

Best way to work with replica sets in MongoDB using 2 servers only


I am going to use a 2-server setup for my production environment, which uses MongoDB.

If I understand correctly, I can have 1 replica set with 2 nodes, one on each server. For failover to elect a new primary node, I also need an arbiter node.

Since I still want to use only 2 servers, if the server holding the arbiter node goes down, there would be no way to elect a new primary.

A solution I came up with is to have 3 arbiter nodes: 1 on one server and the other 2 on the other. That way, if either server goes down, the other server's non-arbiter node will become primary.

Is this correct? Should I use another solution?

Thanks! Ignacio.


Solution

  • Your approach wouldn't work either. With 2 data nodes and 3 arbiters there are five voting members, so three votes (a majority) are needed to elect a primary. If the machine with the 2 arbiters goes down, the other machine can cast only two votes for its replica. This is no better than picking just one of those machines to host a single arbiter.

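    Primary election cannot be guaranteed with only two machines: whichever machine holds the majority of votes is a single point of failure, and an even split of votes gives neither side a majority. You must either tolerate the possibility of not having a primary (your app goes into read-only mode), or add a third machine.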
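The vote arithmetic behind this can be checked with a minimal sketch. It is a simplified model, not MongoDB's election code: it assumes every member has exactly one vote, and the function and variable names are made up for illustration.

```python
def can_elect_primary(total_voting_members, reachable_votes):
    """Return True if the reachable members form a strict majority.

    Simplified model: one vote per member, and a primary needs
    more than half of ALL configured votes, not just the live ones.
    """
    majority = total_voting_members // 2 + 1
    return reachable_votes >= majority


# Proposed layout: 2 data nodes + 3 arbiters = 5 voting members.
TOTAL_VOTES = 5
server_a_votes = 2  # data node + 1 arbiter
server_b_votes = 3  # data node + 2 arbiters

# Server B (the one with 2 arbiters) goes down: only server A's votes remain.
print(can_elect_primary(TOTAL_VOTES, server_a_votes))  # False

# Server A goes down: server B still holds a majority.
print(can_elect_primary(TOTAL_VOTES, server_b_votes))  # True
```

However the votes are split across two machines, one machine always holds the majority (or neither does), so losing that machine leaves the set without a primary.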

    With only two machines, I would consider putting the arbiter on the better of the two machines (which should be hosting the primary anyway). In this setup,

    1. When the secondary becomes unavailable, the primary will continue to function.
    2. When the primary becomes unavailable but the arbiter is still up, the secondary will become primary.
    3. When the primary and arbiter both become unavailable (e.g. that whole machine becomes unavailable) then the secondary will remain secondary and there will be no primary.
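The three scenarios above can be reproduced with a small simulation. This is a sketch under simplifying assumptions: one vote per member, all surviving members can reach each other, and the member names are placeholders rather than real host names.

```python
# Three voting members: the primary and arbiter share the better machine,
# the secondary runs on the other one.
MEMBERS = ("primary", "arbiter", "secondary")


def acting_primary(down):
    """Return which data node acts as primary, or None if no election can succeed."""
    up = [m for m in MEMBERS if m not in down]
    majority = len(MEMBERS) // 2 + 1  # 2 of 3 votes

    if len(up) < majority:
        return None  # the survivors cannot reach a majority
    if "primary" in up:
        return "primary"  # the original primary keeps its role
    if "secondary" in up:
        return "secondary"  # the secondary is elected with the arbiter's vote
    return None  # an arbiter holds no data and can never become primary


print(acting_primary({"secondary"}))           # scenario 1: primary
print(acting_primary({"primary"}))             # scenario 2: secondary
print(acting_primary({"primary", "arbiter"}))  # scenario 3: None
```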

    I've found scenario 3 to be more common than scenario 2. To account for this, you might consider using MMS (MongoDB Management Service) or a similar tool to monitor the cluster, so that any node outage (particularly scenario 3) can be investigated as soon as possible.
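If a hosted monitoring service is not an option, a rough check can be built on the output of MongoDB's `replSetGetStatus` command. The sketch below only inspects a status document that has already been fetched. The function name and host names are hypothetical, and it relies only on the `members`, `name`, `health` and `stateStr` fields of that output.

```python
def find_problems(status):
    """Return warnings for a replSetGetStatus-style document."""
    problems = []
    members = status.get("members", [])

    # health is 1 for a reachable member and 0 for an unreachable one.
    for member in members:
        if member.get("health") != 1:
            problems.append(f"{member.get('name')} is unreachable")

    # Scenario 3: nobody reports the PRIMARY state.
    if not any(m.get("stateStr") == "PRIMARY" for m in members):
        problems.append("replica set has no primary")

    return problems


# Example document as the surviving secondary might report it in scenario 3
# (host names are placeholders).
sample_status = {
    "members": [
        {"name": "server1:27017", "health": 0, "stateStr": "(not reachable/healthy)"},
        {"name": "server1:27018", "health": 0, "stateStr": "(not reachable/healthy)"},
        {"name": "server2:27017", "health": 1, "stateStr": "SECONDARY"},
    ]
}

for warning in find_problems(sample_status):
    print(warning)
```

With PyMongo, the status document could be obtained with something like `MongoClient(...).admin.command("replSetGetStatus")` and passed to this function from a scheduled job. How the warnings are delivered (email, pager, chat) is left open.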