Cassandra 2.1: changing the snitch from Ec2Snitch to GossipingPropertyFileSnitch


We currently run Ec2Snitch across two AZs in a single AWS region. The goal was to stay available even when one AZ is down. Most data is replicated with RF=2, so with Ec2Snitch each AZ holds one copy.

We have now decided to move to GossipingPropertyFileSnitch. The main reason is that an AZ outage is rare, and even if one happens, other systems in our stack don't tolerate it, so the whole app goes down anyway.

The other reason is that with Ec2Snitch and two AZs we have to scale in multiples of two (one node in each AZ). With GossipingPropertyFileSnitch and a single rack, we can add one node at a time.

Will the topology change when we change the snitch setting? I want to avoid having to run nodetool repair. Whenever we have run it, it has failed and seemed to run forever.


Solution

  • Whether the topology changes depends on how you carry out the change. If you assign each node the same logical dc and rack that it currently has, you shouldn't get a topology change.

    After switching to GossipingPropertyFileSnitch, set each node's rack to match its AZ, then do a rolling restart for the new configuration to take effect. Note that collapsing both AZs into a single rack is itself a topology change.

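    Example cassandra-rackdc.properties for 2 nodes in 1 dc across 2 AZs (the dc and rack values are placeholders; use the names that nodetool status currently reports for each node):

    # node=10.0.0.1, dc=first, AZ=1
    # Before (Ec2Snitch):
    dc_suffix=first
    # After (GossipingPropertyFileSnitch):
    dc=first
    rack=1

    # node=10.0.0.2, dc=first, AZ=2
    # Before (Ec2Snitch):
    dc_suffix=first
    # After (GossipingPropertyFileSnitch):
    dc=first
    rack=2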

    On a side note, you should find out why repairs are failing. Unfortunately, they are very important for cluster health.
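
One way to check that the switch leaves the topology unchanged is to save the output of nodetool status before the change and again after the rolling restart, then compare the dc and rack reported for each node. The following is only a minimal sketch in Python, not part of Cassandra. The function names are hypothetical. It assumes the usual nodetool status layout: a "Datacenter:" line per data center, followed by one row per node that starts with a two-letter state code and ends with the rack. The same parsed values can be used to fill in cassandra-rackdc.properties.

```python
def parse_status(text):
    """Map each node address to (dc, rack) from `nodetool status` output."""
    topology = {}
    dc = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Datacenter:" and len(parts) > 1:
            dc = parts[1]
        elif (dc is not None and len(parts) >= 3 and len(parts[0]) == 2
              and parts[0][0] in "UD" and parts[0][1] in "NLJM"):
            # Node row: state code, address, ..., rack in the last column.
            topology[parts[1]] = (dc, parts[-1])
    return topology


def rackdc_properties(dc, rack):
    """Render cassandra-rackdc.properties content for GossipingPropertyFileSnitch."""
    return f"dc={dc}\nrack={rack}\n"


def topology_changes(before, after):
    """Return the addresses whose (dc, rack) differs between two snapshots."""
    return sorted(addr for addr in before.keys() | after.keys()
                  if before.get(addr) != after.get(addr))


# Made-up sample in the shape of `nodetool status` output (assumption).
SAMPLE = """\
Datacenter: first
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load     Tokens  Owns  Host ID                               Rack
UN  10.0.0.1  1.2 GB   256     ?     11111111-1111-1111-1111-111111111111  1
UN  10.0.0.2  1.1 GB   256     ?     22222222-2222-2222-2222-222222222222  2
"""

if __name__ == "__main__":
    for address, (dc, rack) in sorted(parse_status(SAMPLE).items()):
        print(f"# node={address}")
        print(rackdc_properties(dc, rack))
```

If topology_changes returns an empty list for the snapshots taken before and after the restart, every node kept its dc and rack. Any address it does return is a node whose logical location moved and is worth checking before going further.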