Does a bucket span over all nodes in a Couchbase Server cluster?


I've been reading through the Couchbase Server documentation, and as I understand it, this is "how it all works":

  • A cluster has one or more nodes (servers).
  • A cluster has one or more buckets.
  • A bucket has one or more views.

My questions:

  1. I would assume that the data in a bucket is distributed over all the nodes in a cluster, correct? Or is it replicated over all nodes?
  2. Assuming that a bucket spans several nodes in a cluster, does a view retrieve data from all of these nodes?
  3. Or are a bucket and its views specific to a certain node?

Solution

  • Your assumption in point one is mostly correct. A bucket is sharded into 1024 vBuckets. These vBuckets are distributed across the nodes in the cluster (evenly, give or take a remainder), and replica vBuckets are placed on different nodes from their active (master) vBuckets. By default, each vBucket is replicated to only one other node, so each document has one replica copy; you can configure more replicas for greater availability if required.

    A view (defined in a design document) indexes the data of one bucket across every node and vBucket, but each node stores only the index entries for the vBuckets it holds. So a view query has to go to every node in the cluster, and the partial results are merged. During a rebalance, by default, a node's index is updated as vBuckets are migrated off it: the index data is removed on the source node and regenerated on the node the vBuckets move to.

    A good overview of Couchbase Server's sharding architecture is given in the How-To NoSQL 3.0 Webinar on YouTube.
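
To make the mapping concrete, here is a minimal Python sketch of the idea described above: a key hashes to one of 1024 vBuckets, each vBucket has an active node and a replica on a different node, and a view query gathers partial results from every node. Everything in it is an assumption for illustration only. The function names, the node names, and the round-robin placement are hypothetical, and the hash is a simplified CRC32 modulo rather than the exact function Couchbase clients use. Real clients also receive the vBucket map from the cluster instead of computing it.

```python
import zlib
from heapq import merge

NUM_VBUCKETS = 1024  # figure quoted in the answer above


def vbucket_id(key, num_vbuckets=NUM_VBUCKETS):
    """Map a document key to a vBucket (simplified CRC32 modulo, illustrative)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Give every vBucket an active node and one replica on a different node.

    Round-robin placement is an assumption; the real cluster manager
    computes and publishes this map itself.
    """
    n = len(nodes)
    vb_map = []
    for vb in range(num_vbuckets):
        active = nodes[vb % n]
        replica = nodes[(vb + 1) % n] if n > 1 else None
        vb_map.append((active, replica))
    return vb_map


def locate(key, vb_map):
    """Return (vBucket id, active node, replica node) for a document key."""
    vb = vbucket_id(key, len(vb_map))
    active, replica = vb_map[vb]
    return vb, active, replica


def query_view(per_node_index):
    """Scatter-gather: merge the sorted partial index held by each node."""
    return list(merge(*per_node_index.values()))
```

With three hypothetical nodes, `build_vbucket_map(["node-a", "node-b", "node-c"])` gives one node 342 active vBuckets and the other two 341 each, which is the "evenly, give or take a remainder" distribution. `locate("user::42", vb_map)` then always returns the same vBucket and node pair for that key, which is why a key lookup touches one node while a view query has to touch all of them.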