mongodb spring-boot partitioning sharding

How to shard a collection that already contains data in a MongoDB sharded cluster


I want to shard a collection that already contains data. When I run sh.shardCollection("myDb.myCollection", {id:"hashed"}), the collection is sharded, but its data is not spread across all the shards; it stays on the primary shard. For example:

Empty collection after sharding:

sh.status() result (output not included)

When data is then added, it is spread across all the shards.

Collection with data after sharding:

sh.status() result (output not included)

When data is added, it goes only to the primary shard.

My question is how to correctly shard a collection that already contains data in MongoDB. Is there an alternative way?


Solution

  • I agree with @Wernfried Domscheit's comment that the cluster takes care of distributing the data once the collection is sharded. As mentioned there, this is driven by writes to the collection and happens over time. Your test may have too little data or too few writes to trigger any redistribution.

    As for your specific question about the initial distribution of chunks, the documentation covers it. Applying a hashed shard key to an empty collection, as in your first example, is described here:

    The sharding operation creates empty chunks to cover the entire range of the shard key values and performs an initial chunk distribution. By default, the operation creates 2 chunks per shard and migrates across the cluster. You can use numInitialChunks option to specify a different number of initial chunks. This initial creation and distribution of chunks allows for faster setup of sharding.

    The behavior for a collection that already contains data is covered just above it:

    The sharding operation creates the initial chunk(s) to cover the entire range of the shard key values. The number of chunks created depends on the configured chunk size.

    Both of these described behaviors match what you have demonstrated in your question.
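
The two cases quoted above could be exercised from mongosh roughly as follows. This is only a sketch: the helper function names are hypothetical, the sh and database handles are passed in explicitly instead of relying on the shell globals, and the number of initial chunks is whatever value you choose to pass. It relies on the numInitialChunks option applying only to a hashed key on an empty collection, and on a populated collection needing its shard key index before sh.shardCollection() is called.

```javascript
// Minimal sketch for mongosh connected to a mongos router.
// `shHelper` is the mongosh `sh` global and `database` is a handle such as
// db.getSiblingDB("myDb"); the function names here are illustrative only.

// Empty collection + hashed key: request a specific number of initial chunks.
function shardEmptyHashed(shHelper, namespace, field, numInitialChunks) {
  const key = { [field]: "hashed" };
  // Arguments: namespace, shard key, unique (false for hashed keys), options.
  return shHelper.shardCollection(namespace, key, false, { numInitialChunks });
}

// Collection that already holds data: create the shard key index first.
function shardPopulatedHashed(shHelper, database, collectionName, field) {
  const key = { [field]: "hashed" };
  database.getCollection(collectionName).createIndex(key);
  const namespace = `${database.getName()}.${collectionName}`;
  return shHelper.shardCollection(namespace, key);
}

// Return whether the balancer is enabled and print per-shard data sizes.
function reportDistribution(shHelper, database, collectionName) {
  const balancerEnabled = shHelper.getBalancerState();
  database.getCollection(collectionName).getShardDistribution();
  return balancerEnabled;
}
```

In mongosh this might be called as shardPopulatedHashed(sh, db.getSiblingDB("myDb"), "myCollection", "id"), followed by reportDistribution(sh, db.getSiblingDB("myDb"), "myCollection"). For a collection that already has data, the distribution may show everything on the primary shard at first; if the balancer is enabled, chunks should be migrated to the other shards over time as described above.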