I want to implement Sharding for my MongoDb and need some of your suggestions.
Insight
After a research we found we will use sharding and make it useful with multiple servers which guarantees two things:
Question 1: My problem is to find a correct shard-key to partition the data. I don't see a unique-key in the collection other than the default ObjectId. After further reading I have found that it is possible to use a composite key, does it make sense to have a composite key or custom ObjectId as a key where the value might look like ObjectId: _. This is very key with respect to performance of returning the results of a query & moving the chunks.
Question 2: Since we have large collections, it will become difficult to set the shard each time in Mongo console when a collection is created dynamically. Is there any way to make it automatic in mongo so that whenever a collection is created for a shard-database, it will define the shard-key for that collection?
Question 3: Is it really necessary to pass shard-key to the query expression? I don't think we have used ObjectId in any of our query-expression, I doubt I can come with a unique ID due to fact that the data is not structured like a traditional DB. If yes, how is it going to help for a query like this:
Example:
{ category: "Energy", subcategory: "Watt", Process-Start-Time: {$gte: 132234234}}
Thanks in advance for stepping in and helping me fix this problem.
The easiest way to do this might be to shard the database, but leave the collections unsharded. Benefits:
The downside is that all traffic for a collection will go to its shard, which might be impractical for what you're trying to do.
As far as your questions:
Question 1: Shard key does not have to be unique. What are you generally querying for? You might be better of with something like {category:1}
or {category:1,subcategory:1}
.
Question 2: No built-in way to do it automatically, the best way to get that behavior is probably to set up a cron job.
Question 3: No. Queries containing the shard key can be sent to specific shards and queries without the shard key must be sent to all shards, see http://www.mongodb.org/display/DOCS/Sharding+Introduction#ShardingIntroduction-OperationTypes.