Search code examples
amazon-web-servicesamazon-ec2cassandracassandra-3.0cassandra-stress

Autoscaling a Cassandra cluster on AWS


I have been trying to auto-scale a 3node Cassandra cluster with Replication Factor 3 and Consistency Level 1 on Amazon EC2 instances. Despite the load balancer one of the autoscaled nodes has zero CPU utilization and the other autoscaled node has considerable traffic on it.

I have experimented more than 4 times to auto-scale a 3 node with RF3CL1 and the CPU utilization on one of the autoscaling nodes is still zero. The overall CPU utilization has a drop but one of the autoscaled nodes is consistently idle from the point of auto scaling.

Note that the two nodes which are launched at the point of autoscaling are started by the same launch configuration. The two nodes have the same configuration in every aspect. There is an alarm for the triggering of the nodes and the scaling policy is set as per that alarm.

Can there be a bash script that can be run on the user data?

For example, altering the keyspaces?

Can someone let me know what could be the reason behind this behavior?


Solution

  • AWS auto scaling and load balancing is not a good fit for Cassandra. Cassandra has its own built in clustering with seed nodes to discover the other members of the cluster, so there is no need for an ELB. And auto scaling can screw you up because the data has to be re-balanced between the nodes.

    https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf