Tags: cassandra, scylla

How to set up Cassandra clusters


Currently, we have to set up our Cassandra clusters to support two data centers. We have a rough idea that the database clusters can be set up as shown in the following picture.

[Image: diagram of the two proposed layouts, top and bottom]

According to this picture, suppose that we have 6 database nodes.

Regarding the bottom one:

1. 3 nodes in data center 1.

2. 3 nodes in data center 2.

3. Create one cluster that includes all the nodes.

4. When creating the keyspace, we can use NetworkTopologyStrategy with replication factors DC1:2, DC2:2 to replicate the data across data centers.
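As a rough illustration of step 4, the keyspace definition could be generated along these lines. This is only a sketch: the helper `create_keyspace_cql` and the keyspace name `my_keyspace` are made up for the example, and the data center names `DC1` and `DC2` are assumed to match whatever names the cluster's snitch actually reports.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replication factor.
    The names here are hypothetical; they must match the cluster's real DC names.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    # Render each data center as a 'name': rf pair inside the replication map.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Two replicas in each data center, as in the question (hypothetical names).
statement = create_keyspace_cql("my_keyspace", {"DC1": 2, "DC2": 2})
print(statement)
```

The resulting statement can be run from cqlsh or any client driver. Changing the numbers in the dictionary is all it takes to try a different replication factor per data center.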

Regarding the top one:

1. 3 nodes in Cluster 1, data center 1.

2. 3 nodes in Cluster 2, data center 2.

Now my question is: how do we set up replication between Cluster 1 and Cluster 2, or between data center 1 and data center 2?

Thanks.


Solution

  • Your bottom picture, from what you've described, is the Cassandra model: one cluster with as many DCs as you need. You configure replication between the DCs at will, per keyspace. If you have 2 separate clusters, they do not replicate to each other, as they're completely independent databases. The bottom picture is how Cassandra works, and it is all that you need. Every node is treated as a master (i.e. there are no dedicated master nodes in Cassandra; all nodes are equal).

    Just to add a note to your diagram: having a 3-node DC with RF=2, while providing some redundancy, takes away one of the most common client consistency options, LOCAL_QUORUM. The reason is that a quorum is floor(RF/2) + 1, so with RF=2 a LOCAL_QUORUM needs 2 replicas, which is the same as ALL within that DC. If you chose that, you would completely sacrifice availability for some of the data: both "local" replicas of any given partition would have to be available at all times to avoid errors. If you plan on using LOCAL_QUORUM, you should change to RF=3, which allows a single node to be unavailable without issues.

    -Jim
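To make the quorum arithmetic above concrete, here is a small sketch. The function names `quorum` and `tolerable_failures` are made up for this illustration; the formula is the usual floor(RF/2) + 1 applied to the replication factor of the local DC.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a (LOCAL_)QUORUM request: floor(RF/2) + 1."""
    return replication_factor // 2 + 1


def tolerable_failures(replication_factor: int) -> int:
    """Replicas of a partition that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(
        f"RF={rf}: LOCAL_QUORUM needs {quorum(rf)} replicas, "
        f"tolerates {tolerable_failures(rf)} unavailable"
    )
```

With RF=2 the quorum equals the replication factor, so no replica may be down; with RF=3 the quorum is still 2, which leaves room for one unavailable replica.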