cassandra, cassandra-2.0, cassandra-2.1

How can we understand the concept of replication factor in Cassandra?


What is the replication factor in Cassandra, and how does it affect single-DC and multi-DC clusters?


Solution

  • Cassandra stores replicas on multiple nodes to ensure reliability and fault tolerance. The total number of replicas across the cluster is referred to as the replication factor. A replication factor of 1 means that there is only one copy of each row, stored on a single node. A replication factor of 2 means there are two copies of each row, each on a different node. All replicas are equally important; there is no primary or master replica.

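    When creating a keyspace, you specify the replication strategy and the replication factor. With SimpleStrategy there is a single factor for the whole cluster; with NetworkTopologyStrategy you set a factor for each data center (DC).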

    Example, single DC with SimpleStrategy:

    CREATE KEYSPACE Excelsior WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
    
    • Here replication_factor is set to 3, so each row is placed on three different nodes.
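
To make "three different nodes" concrete, here is a minimal Python sketch of how a SimpleStrategy-style placement picks nodes: the first replica goes to the node that owns the row's token, and the remaining replicas go to the next nodes clockwise around the ring. The ring, token values, node names, and the function simple_strategy_replicas are all hypothetical, and the model is simplified (one token per node, no vnodes); it is an illustration, not Cassandra's actual code.

```python
from bisect import bisect_left

# Hypothetical ring: one token per node, no vnodes (a simplification).
RING = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]


def simple_strategy_replicas(row_token, ring, replication_factor):
    """Return the nodes that would hold a row in this simplified model."""
    ring = sorted(ring)
    tokens = [token for token, _ in ring]
    # First replica: the first node whose token is >= the row's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, row_token) % len(ring)
    # Assumption of this sketch: never place two copies on the same node,
    # so the number of replicas is capped at the number of nodes.
    count = min(replication_factor, len(ring))
    # Remaining replicas: the next nodes clockwise around the ring.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    # With replication_factor 3, the row lands on three distinct nodes.
    print(simple_strategy_replicas(60, RING, 3))
```

In this model a row with token 60 is owned by node-d, and the two extra copies go to node-a and node-b, the next nodes on the ring.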

    Example, multiple DCs with NetworkTopologyStrategy:

    CREATE KEYSPACE Excalibur WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};
    
    • This example sets three replicas for a data center named dc1 and two replicas for a data center named dc2. The data center names must match the names reported by the cluster's snitch.
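
As a quick check on the multi-DC case, the per-DC settings add up: each data center keeps its own set of copies, so every row in Excalibur exists five times in total. A minimal Python sketch of that arithmetic follows; the helper names replicas_per_dc and total_replicas are hypothetical and are not part of any Cassandra API.

```python
# Replication options copied from the Excalibur keyspace definition.
REPLICATION = {"class": "NetworkTopologyStrategy", "dc1": 3, "dc2": 2}


def replicas_per_dc(options):
    """Map each data center name to its replica count (skips 'class')."""
    return {name: int(count) for name, count in options.items() if name != "class"}


def total_replicas(options):
    """Total copies of each row across all data centers."""
    return sum(replicas_per_dc(options).values())


if __name__ == "__main__":
    print(replicas_per_dc(REPLICATION))
    print(total_replicas(REPLICATION))
```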

    Source : https://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureDataDistributeReplication_c.html