My understanding of CouchDB's clustering approach is that it ensures each node in the cluster receives an equal share of the data by hashing each document's unique ID. My question is whether there is a way to change this and define a custom key for "intelligently" routing a document to a specific node in the cluster.
In my scenario, I have data that relates to a specific entity (think client-project-task-item). Across all my data, I will have enough items to require some horizontal scaling; however, each search will always relate to a given client-project-task, for which the data set is only of moderate size.
I think the most efficient approach will be to partition my data by client-project-task and pre-allocate, say, 1000 partitions.
I understand that at a certain point this will limit my scaling capacity, but not having to hit every partition for every search makes that a trade-off I'm willing to accept.
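To make the idea concrete, here is a minimal sketch of the routing I have in mind. It is not CouchDB code: `partition_for`, `NUM_PARTITIONS`, and the separator are names I made up for illustration, and SHA-256 is just one stable hash that would work.

```python
import hashlib

# Assumption: a fixed number of partitions allocated up front.
NUM_PARTITIONS = 1000

# Assumption: the ASCII unit separator never appears in the key parts,
# so ("a", "b/c") and ("a/b", "c") cannot collide after joining.
SEPARATOR = "\x1f"


def partition_for(client: str, project: str, task: str,
                  num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a client-project-task triple to a partition index.

    The item is deliberately left out of the key, so every item of a
    given task lands on the same partition.
    """
    key = SEPARATOR.join((client, project, task)).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce it
    # to the range [0, num_partitions).
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # Every item of the same task routes to one partition, so a search
    # scoped to that task only has to touch that partition.
    p = partition_for("acme", "website", "task-17")
    print(f"all items of acme/website/task-17 go to partition {p}")
```

Because the partition count is part of the modulus, changing it later would remap most keys, which is the scaling limit mentioned above.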
So is there a way to create this type of partitioning logic in CouchDB?
Thanks; Brent
As mentioned in the comments, CouchDB does not have built-in support for sharding yet. However, with the ongoing BigCouch merge and the release of CouchDB 2.0, it will. That code comes from Cloudant, so you can already get a sense of how it works by reading their documentation and whitepapers, along with the available material on BigCouch.
The latest CouchDB weekly news mentions that this will soon be covered in the docs: http://blog.couchdb.org/2014/08/14/couchdb-weekly-news-august-14-2014/