We are looking at deploying a clustered database and would like to scale the number of nodes up and down, with some systems having one, two, or three nodes. Percona XtraDB Cluster has a weighted quorum mechanism. I think it is possible to choose the weights to break symmetry so that split brain is always avoided. However, I cannot find any literature saying this is so, which makes me doubt myself.
Here is the idea. Assign each node one of the following weights: 127 + 0, 127 + 1, 127 + 2, 127 + 4, 127 + 8, 127 + 16, and so on.
Given two disjoint subsets, a and b, of the nodes in the cluster, the two requirements that I think have to be maintained are:
1) If a subset a has more nodes than another subset b, then the sum of its weights, w(a), will always be greater than the sum of the weights of the other set:
|a| > |b| --> w(a) > w(b)
2) Symmetry breaking: given any two disjoint subsets of equal size, their total weights will differ. This is the requirement that avoids split brain with an even number of nodes:
|a| = |b| --> w(a) != w(b)
Are there any problems with this approach? If so what are they?
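To sanity-check the idea myself, here is a brute-force verification of both requirements for small clusters (a sketch; `weights` and `check` are my own helper names, not anything from Percona). It enumerates every pair of disjoint non-empty subsets:

```python
# Brute-force check of the two requirements for the proposed weights
# 127+0, 127+1, 127+2, 127+4, ... (sketch, not Percona configuration).
# The 127 base makes node count dominate; the distinct power-of-two
# increments break ties between equal-sized subsets.
from itertools import combinations

def weights(n):
    # extras: 0, 1, 2, 4, 8, ... (0 for the first node, then powers of two)
    return [127 + (0 if i == 0 else 2 ** (i - 1)) for i in range(n)]

def check(n):
    w = weights(n)
    nodes = range(n)
    for ka in range(1, n + 1):
        for a in combinations(nodes, ka):
            rest = [i for i in nodes if i not in a]
            for kb in range(1, len(rest) + 1):
                for b in combinations(rest, kb):
                    wa = sum(w[i] for i in a)
                    wb = sum(w[i] for i in b)
                    if len(a) > len(b) and not wa > wb:
                        return False          # requirement 1 violated
                    if len(a) == len(b) and wa == wb:
                        return False          # requirement 2 violated
    return True

# Exhaustive for cluster sizes up to 8, which covers the small
# one-to-three-node systems described above with room to spare.
print(all(check(n) for n in range(1, 9)))  # → True
```

Note the 127 base only dominates while it exceeds the sum of the power-of-two increments any smaller subset can hold, so larger clusters would need re-checking.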
This sounds like it will act as you describe, but it appears unnecessarily complicated. As far as I can tell from a web search (https://www.percona.com/blog/2015/06/12/percona-xtradb-cluster-quorum-availability-cluster/), when the system is running normally, every node knows the total weight of the cluster. After a loss of connectivity, a connected subset will only allow writes if it still has MORE than 50% of the total weight. In a potential split-brain situation, it is impossible for both sides of the split to believe that they each have MORE than 50% of the weight, because then the weights of the two halves would have to add up to more than the total weight of the cluster, which is impossible.
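That argument is easy to see in miniature (a sketch with a hypothetical `has_quorum` helper, not the actual Galera/Percona code): a partition claims quorum only if twice its weight exceeds the total, so two disjoint partitions both claiming quorum would need a combined weight above the total.

```python
# Sketch of the ">50% of total weight" rule (hypothetical helper,
# not the actual Galera/Percona implementation).
def has_quorum(partition_weight, total_weight):
    # Strict majority: twice the partition's weight must exceed the total.
    return 2 * partition_weight > total_weight

total = 127 + 128 + 129 + 131          # four nodes under the proposed scheme
left, right = 127 + 131, 128 + 129     # an even 2/2 split of those nodes
print(has_quorum(left, total), has_quorum(right, total))  # → True False
# The two sides can never both pass: that would require
# left + right > total, but left + right <= total by definition.
```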
You do avoid both sides going read-only if the split is exactly down the middle, but this is not what is normally meant by split brain (https://en.wikipedia.org/wiki/Split-brain_(computing)). If your goal is to maximize the probability that some remainder can still write after a failure, or to maximize write availability to clients after a failure, my guess is that your best bet is to consider which failures are likely given your physical network topology and factor that into your calculation. It seems unlikely that all possible patterns of unavailability are equally probable.
If you have an odd number of nodes and give each a weight of one, no split will leave equal weight on each side, because the total weight must be odd. If you have an even number of nodes, give one node a weight of two and the others a weight of one each. Then you again have an odd total weight, so no split can give each half the same weight. Of course, if you lose one node in the split then the remainder can still split equally, e.g. A | B,C,D | E,F,G, but I'm not sure you want E,F,G to continue in this situation: if the split had only been two-way, as A,B,C,D | E,F,G, you would probably want A,B,C,D to continue, and E,F,G cannot tell whether the split is A | B,C,D | E,F,G or A,B,C,D | E,F,G.
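The "one node gets weight two" scheme can be checked the same way (a sketch; `no_tied_split` is my own helper name): with an odd total weight, no two-way split can leave equal weight on both sides.

```python
# Check the simpler scheme: for an even number of nodes, one node
# gets weight 2 and the rest weight 1, so the total weight is odd
# and no two-way split can be a tie (sketch, not Percona configuration).
from itertools import combinations

def no_tied_split(n):
    w = [2] + [1] * (n - 1)        # one weight-2 node, rest weight 1
    total = sum(w)                 # odd: 2 + (n - 1) with n even
    for k in range(1, n):
        for side in combinations(range(n), k):
            s = sum(w[i] for i in side)
            if s == total - s:     # both sides of the split weigh the same
                return False
    return True

print(all(no_tied_split(n) for n in range(2, 9, 2)))  # → True
```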