Search code examples
mongodbshardingchunks

Why does a monotonically increasing shard key causes inserts to be routed to the chunk with maxkey as upper bound?


I am read in the MongoDB documentation about Shard Keys, the following"

If the shard key value is always increasing, all new inserts are routed to the chunk with maxKey as the upper bound. If the shard key value is always decreasing, all new inserts are routed to the chunk with minKey as the lower bound.

[https://docs.mongodb.com/manual/core/sharding-shard-key/#sharding-shard-key-creation][1]

I don't understand why. Let's say that the shard key ranges from 0 to 75, and is partitioned into 3 chunks. First chunk from 0 to 24, second from 25 to 50, and third from 51 to 75. Here the minKey is 0, and maxKey is 75. If consecutive insert operations occur in a monotonically increasing way, say for example , 1,2,3,4,5 why would these inserts that are addressing shared key values of 1,2,3,4,5 (monotonically increasing), be routed to the last shard that goes from 51 to 75, the last shard? (this is the shard that includes the chunk with the maxKey as the upperbound) ?

Thank you


Solution

  • Let's say that the shard key ranges from 0 to 75 as you said with 3 chunks. First chunk from 0 to 24, second from 25 to 50, and third from 51 to 75.

    If you insert 1, 2, 3, 4, 5 they'll go all to first chunk. Not to last one. Since they are in the range of 0 to 24.

    What the documentation talks about is when you have keys that are always increasing e.g. from 0 to Infinity.

    In this case from 0 to Infinity there may be 3 chunks. First chunk from 0 to 24, second from 25 to 50, and third from 51 to 75. But since the value of the key is always be increasing to Infinity all values > 50 will go to the third chunk and <= 50 values will to to first 2 chunks.

    In sharded clusters there is a Balancer that will try to balance your chunks so they have same amount of data. With keys that are always increasing it can be difficult for the balancer to take a decision about how to split the chunks. So at the first time all inserts with values > 50 are routed to the chunk with maxKey as the upper bound i.e. third chunk.