I have a small cluster consisting of several shards; each shard is a replica set with 2 data-bearing nodes and 1 arbiter. Sharding is enabled on a collection, let's say generator_v1_food.
I've stopped all the programs that update the collection (these programs perform ONLY upsert and find operations, no remove at all), and I've also turned off the balancer. The last lines of the log on the shard I was working on were all about the replica set. Even so, the collection count keeps changing when I run it at 2-3 second intervals:
mongos> db.generator_v1_food.find().count()
28279890
mongos> db.generator_v1_food.find().count()
28278067
mongos> db.generator_v1_food.find().count()
28278008
...
What is happening behind the scenes? Any pointers would be great.
quote:
Setting the balancer state to "off" does not mean the balancer is no longer running; it may still be finishing the cleanup from the last moveChunk that was performed.
You should be able to see, in the changelog collection of the config DB, when the last moveChunk.commit event occurred. That is the point at which the moveChunk process committed to the documents of a chunk being moved to the new (target) shard. After that, the old shard still has to delete, asynchronously, the documents that no longer belong to it. Because the "count" is taken from metadata and does not actually query for how many documents there are "for real", it will double count documents that are "in flight" during balancing rounds (as well as any that were not properly cleaned up or were left over from aborted balance attempts).
Asya
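To check this on your own cluster, something along these lines could be pasted into the mongos shell. It is only a rough sketch, not part of the answer above: the helper names lastMoveChunkCommit and compareCounts are made up for illustration, and it assumes the changelog entries carry the usual "what" and "time" fields. The first helper looks up the newest moveChunk.commit entry; the second compares the metadata-based count with a count obtained by actually iterating the cursor, which should not include documents that are still waiting to be cleaned up on the old shard.

```javascript
// Hypothetical helper: return the newest moveChunk.commit entry from
// config.changelog, or null if no such entry is recorded.
function lastMoveChunkCommit(db) {
  var cursor = db.getSiblingDB("config").changelog
    .find({ what: "moveChunk.commit" })
    .sort({ time: -1 })   // newest first (assumes a "time" field)
    .limit(1);
  return cursor.hasNext() ? cursor.next() : null;
}

// Hypothetical helper: compare the fast count with a count made by
// iterating the cursor over the same collection.
function compareCounts(db, collName) {
  var coll = db.getCollection(collName);
  var metadataCount = coll.find().count();    // fast, may double count
  var iteratedCount = coll.find().itcount();  // slow, walks every document
  return {
    metadataCount: metadataCount,
    iteratedCount: iteratedCount,
    difference: metadataCount - iteratedCount
  };
}
```

In the shell you would call lastMoveChunkCommit(db) and compareCounts(db, "generator_v1_food"). Keep in mind that itcount() has to walk all ~28 million documents, so it will take a while. If the difference shrinks towards zero over time with no writers running, that is consistent with the old shard still deleting migrated documents. sh.getBalancerState() and sh.isBalancerRunning() can also be used to tell apart "disabled" from "not currently running a round".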