Does anyone have any thoughts on putting together a fleet of Cassandra instances (including seeds) that rely solely on AWS Spot Instances and Elastic IP addresses? Keep in mind this is a personal POC project, so I'm trying to do it as cost-effectively as possible.
If the cluster is 2 seed nodes and 4 non-seed nodes, could I create something that resembles the following for the seed nodes:
- Use 2 separate Auto Scaling Groups (ASGs), each with min and max set to 1, and auto-assign an Elastic IP (EIP), probably via user data on startup.
- Seed nodes get a higher Spot price than the non-seed nodes.
- Seed nodes always start off with a publicly assigned IP address so they can reach the AWS API and associate the EIP with themselves.
- A seed node is exactly like a non-seed node except that it has the EIP script associated with it
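For the EIP script, I'm picturing something like the rough sketch below, run from user data on boot. It assumes boto3 is installed on the instance and that the instance role is allowed to call `AssociateAddress`; the function names and the allocation ID are placeholders I made up. `AllowReassociation=True` is what should let the new ASG node take the EIP away from whichever node currently holds it.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"  # EC2 instance metadata service


def fetch_instance_id(timeout=2):
    """Ask the instance metadata service (IMDSv2) for this instance's ID."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    id_req = urllib.request.Request(
        f"{IMDS}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(id_req, timeout=timeout) as resp:
        return resp.read().decode()


def associate_eip(ec2_client, allocation_id, instance_id):
    """Attach the EIP to this instance, taking it from any previous holder."""
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        AllowReassociation=True,
    )
    return response["AssociationId"]


# Intended use from user data (not executed here; the allocation ID is a placeholder):
#   import boto3
#   associate_eip(boto3.client("ec2"), "eipalloc-EXAMPLE", fetch_instance_id())
```

The EC2 client is passed in rather than created inside the function so the association logic can be exercised with a stub before it ever touches a real account.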
For the non-seed nodes:
- Auto Scaling Groups with min count at 4 and desired capacity at 4
- Set their seed IP addresses in the cassandra.yaml file to point at the elastic IP addresses.
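To make the seed list concrete, here is a small sketch of how I'd generate that part of cassandra.yaml at boot. The `seed_provider` layout with `SimpleSeedProvider` is the stock one from the default config; the helper name is my own and the addresses are placeholders from the documentation range, not real EIPs.

```python
import ipaddress


def render_seed_provider(seed_ips):
    """Return the seed_provider block of cassandra.yaml for the given seed IPs."""
    if not seed_ips:
        raise ValueError("at least one seed address is required")
    # Validate and normalise each address; raises ValueError on bad input.
    seeds = ",".join(str(ipaddress.ip_address(ip)) for ip in seed_ips)
    return (
        "seed_provider:\n"
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "      parameters:\n"
        f'          - seeds: "{seeds}"\n'
    )


# Placeholder addresses standing in for the two seed EIPs.
print(render_seed_provider(["203.0.113.10", "203.0.113.11"]))
```

Every node, seed or not, would get the same block, so the only per-role difference stays the EIP script.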
And for the starter seed nodes:
- The first couple of seed nodes (the "starter" seed nodes) will probably be created outside of the ASGs to kick off the process.
- Once those starter seed nodes are set up and talking, I plan to spawn the 2 seed node ASGs, which will reassign the EIPs and take over the role of seed nodes.
- Destroy the starter seed nodes once the ASG seed nodes have taken over.
I'm familiar with AWS and the scripting to make that happen, but I am very new to Cassandra so:
- Is my proposal possible?
- Am I missing some glaring technological limitations with Cassandra that will cause problems in the future?
- Will this work with DataStax OpsCenter?
- Will the cleanup of old nodes happen automatically when the ASGs scale up (or down)?
- When a new seed node comes online in the future, will reassigning the EIP to itself interfere with its ability to sync with the cluster?
Things I have considered:
- In case the entire fleet fails, I plan to run Netflix Priam to take backups every 30 minutes.
- It will be rolled out to multiple AZs and regions if it works in this POC.
- In production I would keep the config identical but run the nodes as On-Demand Instances.
Thanks for your help, and for any reference material that would help make this happen.
Cheers