mongodb sharding

MongoDB 2.0.5 sharding no longer balanced


A few days ago I started seeing a problem where the data going into MongoDB is no longer spread across the cluster. Almost all of it goes to the database's primary shard, even though the shard key hasn't changed. The shard keys are MD5 sums of another document field, similar to the hashed shard keys in MongoDB 2.4, so there should be more than enough variability to spread documents across the three shards. We're currently running 2.0.5.

I can't find anything in the config database that would indicate why the documents are only going to the primary. We create a new collection each day and write roughly 40 million documents into it. I've verified that the collections are sharded, and the balancer is slowly moving chunks off the primary, but not fast enough to keep up with the write rate.

Each server that writes into mongo has its own mongos instance, and there are a few mongos instances for processes that read data. The total number we have running is a bit over 25. Could the number of mongos instances cause this problem? It seems like I need to manually define the ranges for sharding, but that seems problematic to me. I'd like to keep auto-sharding in place. Where in the config database does this information reside? Is it possible to see what the ranges are that the mongos instances use for auto-sharding?


Solution

  • Based on what I've found, a newly created sharded collection starts out with all of its data on the primary shard. Nothing is directed to the other shards until chunks have been split and the balancer has moved some of them. For high-volume insertion into MongoDB this can cause problems. It's possible that an increase in overall write volume is now keeping the system from distributing data properly, since previous collections were spread across the three shards without trouble.

    The solution is to pre-split the collection based on knowledge of the shard key(s). The question "How to define sharding range for each shard in Mongo?" explains how to do that properly.
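
As a rough illustration of pre-splitting for an MD5-based shard key, here is a minimal Python sketch that computes evenly spaced split points and builds the matching `split` and `moveChunk` command documents. It is only a sketch under several assumptions that are not from the original post: the shard key is stored as a 32-character lowercase hex string, the namespace `mydb.events_20130101`, the field name `key_md5` and the shard names are hypothetical, and the first shard in the list is taken to be the primary (so chunks assigned to it need no move).

```python
# Sketch: pre-split plan for a shard key that is a 32-character lowercase
# MD5 hex string. Namespace, field name and shard names are hypothetical.

MD5_HEX_DIGITS = 32
KEYSPACE = 16 ** MD5_HEX_DIGITS  # number of distinct MD5 values (2**128)


def split_points(num_chunks):
    """Return num_chunks - 1 hex strings that cut the MD5 keyspace evenly."""
    if num_chunks < 2:
        return []
    return [
        format(KEYSPACE * i // num_chunks, "032x")
        for i in range(1, num_chunks)
    ]


def presplit_commands(namespace, key_field, shards, chunks_per_shard):
    """Build admin-command documents for an empty sharded collection.

    First one `split` per split point, then one `moveChunk` for every chunk
    that should live on a shard other than shards[0] (assumed primary).
    Chunks are assigned to shards round-robin.
    """
    points = split_points(len(shards) * chunks_per_shard)
    commands = []
    for point in points:
        commands.append({"split": namespace, "middle": {key_field: point}})
    for i, point in enumerate(points):
        # The chunk whose lower bound is points[i] is chunk number i + 1;
        # chunk 0 (below the first split point) stays on the primary.
        target = shards[(i + 1) % len(shards)]
        if target == shards[0]:
            continue  # already on the assumed primary, nothing to move
        commands.append(
            {"moveChunk": namespace, "find": {key_field: point}, "to": target}
        )
    return commands


plan = presplit_commands(
    "mydb.events_20130101",                  # hypothetical namespace
    "key_md5",                               # hypothetical shard key field
    ["shard0000", "shard0001", "shard0002"],  # hypothetical shard names
    chunks_per_shard=2,
)
```

Each document in `plan` would then be run against a mongos as an admin command (for example with `db.adminCommand(...)` in the mongo shell) right after the day's collection is sharded and before the writers start. The command name has to stay the first key in each document, which is why the dictionaries are built in that order.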