We're trying to come up with the appropriate RavenDb topology that would allow us to balance load as well as be fault tolerant. It seems that better approach for load balancing would be to use native sharding, we might shift to use it but due to domain peculiarities it is not trivial at this point. In order to have redundancy we just setup 2 ravendb nodes per group with master/master replication between so if one fails, RavenDb client will automatically switches to another one. We have indexing "component" which is the only one who will be writing to the database so it'll be writing to one node and we expect these changes to be distributed eventually. We're going to setup master/master replication between two groups of ravendb nodes so if indexing component will eventually fall back to the group 1, changes should be replicated to the second group.
So, it seems there's low risk of having conflicts since we have only one player who writes to the database (with bunches, once in a minute). Several questions regarding this setup:
1) It is pretty common to have many nodes in such a cluster, yes. Note that you need to setup replication to be changed and replicated in such topology.
2) It is generally easier to have a fully connected topology, rather than the layers you have here..
3) The failover is always based on the client's primary node order of destinations. In other words, if node 2 has destinations (node 1, node 3) and node 1 has destinations (node 3, node 2). A client that was originally connected to node 2 would go to node 1 then node 3 on failover, and a client that was originally connected to node 1 will go to node 3 and then 2 on failover.
4) Round robin and failover behave separately.