I have a Riak cluster of 3 nodes, with 64 partitions and n_val = 3, but I find that for some objects the hosting partitions / vnodes are not spread out across the 3 nodes. In some cases, 2 of them are on one node and the third is on a second node. That runs contrary to my understanding (link here: http://docs.basho.com/riak/kv/2.1.4/learn/concepts/clusters/) that the data is spread out across partitions in such a way that the partitions are on different servers. Is there something I'm missing about how Riak works? Thanks...
When storing a value at a specific bucket/key, Riak hashes the bucket/key pair to obtain a 160-bit value that determines where it should be stored. The entire hash space is evenly divided into partitions, each identified by its index in the hash space, and the partitions are assigned to physical nodes. For n_val=3, the value is stored in the next 3 higher-numbered partitions.
When nodes join a cluster, Riak attempts to assign partitions to nodes such that no two of any 3 consecutive partitions are assigned to the same physical node.
Since the partitions divide the hash space evenly, and the only prime factor of 2 raised to the 160th power is 2, the number of partitions is also a power of 2.
The hash space is treated as a ring, so partition 0 immediately follows the highest numbered partition.
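To make the placement concrete, here is a minimal Python sketch of the lookup described above. It assumes SHA-1 over a simple `bucket/key` byte string as a stand-in for Riak's own hashing of the bucket/key pair, so the partition numbers it prints will not match a real cluster. The function names are hypothetical and only illustrate the arithmetic.

```python
import hashlib

RING_BITS = 160                      # size of the hash space: 2**160
NUM_PARTITIONS = 64                  # ring size from the question
N_VAL = 3                            # replicas per object
PARTITION_SIZE = 2**RING_BITS // NUM_PARTITIONS


def key_hash(bucket: bytes, key: bytes) -> int:
    """Hash a bucket/key pair to a 160-bit integer.

    Assumption: the byte encoding here is a stand-in, not Riak's actual one.
    """
    digest = hashlib.sha1(bucket + b"/" + key).digest()
    return int.from_bytes(digest, "big")


def partitions_after(hash_value: int, n_val: int = N_VAL) -> list:
    """Return the next n_val partition numbers after the one containing hash_value."""
    containing = hash_value // PARTITION_SIZE
    # The ring wraps: partition 0 follows partition NUM_PARTITIONS - 1.
    return [(containing + i) % NUM_PARTITIONS for i in range(1, n_val + 1)]


if __name__ == "__main__":
    h = key_hash(b"my_bucket", b"my_key")
    print(partitions_after(h))
```

A hash that falls in the last partition wraps to the start of the ring, so `partitions_after(2**160 - 1)` gives partitions 0, 1 and 2.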
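There is no way to assign a power-of-2 number of partitions to 3 nodes without putting two of some 3 consecutive partitions on the same node. With only 3 nodes, keeping every 3 consecutive partitions on distinct nodes forces the owners to repeat in a fixed cycle of 3 all the way around the ring, and a power of 2 is never divisible by 3, so the cycle breaks where the ring wraps around. The cluster plan should have included a message similar to "not all replicas will be on distinct nodes" to let you know this was happening at the time you set up the cluster.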
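You can check this with a small Python sketch. It assumes a plain round-robin assignment of partitions to nodes, which is only an illustration and not necessarily what Riak's claim algorithm produces on your cluster; the function names are hypothetical.

```python
def round_robin_owners(num_partitions: int, num_nodes: int) -> list:
    """Assign partition p to node p % num_nodes (illustrative assumption)."""
    return [p % num_nodes for p in range(num_partitions)]


def conflicting_windows(owners: list, n_val: int = 3) -> list:
    """Return the start index of every run of n_val consecutive partitions
    (wrapping around the ring) in which some node appears more than once."""
    size = len(owners)
    bad = []
    for start in range(size):
        window = [owners[(start + i) % size] for i in range(n_val)]
        if len(set(window)) < n_val:
            bad.append(start)
    return bad


if __name__ == "__main__":
    owners = round_robin_owners(64, 3)
    # The runs that cross the end of the ring reuse a node.
    print(conflicting_windows(owners))
```

With 64 partitions and 3 nodes, the runs starting at partitions 62 and 63 each contain the same node twice, which matches what you are seeing for some objects. In this sketch the list comes back empty when the node count divides the ring size, for example `round_robin_owners(64, 4)`.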