
Using multiple instance types in node group, but only one actually being used


We're trying to set up a spot node group in EKS with lower- and higher-capacity instance types (e.g. instance_types = ["t3.xlarge", "c5.4xlarge"]), but only the t3 is ever used, even when we request more CPU than it has to offer. Pods still try to use it and just hang.

How do we get the larger instances to come into play?


Solution

  • An AWS Auto Scaling group (ASG) can put weights on instance types, but that functionality isn't built into EKS. What's happening is that the ASG is designed to create the first instance type if possible. It isn't influenced by your K8s workload requests, and therefore you will always get the first type that is available.

    You probably want to create two different node groups (one for the t3.xlarge and another for the c5.4xlarge). Depending on the workloads, you may also want to allow the min-size to be 0.

    Alternatively, if you would rather change the existing node group than have two, these instructions may be useful: https://blog.porter.run/updating-eks-instance-type/
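
    The two-node-group suggestion could look roughly like the sketch below. It assumes you are using Terraform with the AWS provider's aws_eks_node_group resource (the question's instance_types syntax suggests Terraform, but your module may differ). The variable names, node group names, and sizes are hypothetical placeholders, not values from the question.

    ```hcl
    # Sketch only: variables, names, and sizes are illustrative assumptions.
    variable "cluster_name" { type = string }
    variable "node_role_arn" { type = string }
    variable "subnet_ids" { type = list(string) }

    # Node group for the smaller instance type.
    resource "aws_eks_node_group" "spot_small" {
      cluster_name    = var.cluster_name
      node_group_name = "spot-t3-xlarge"
      node_role_arn   = var.node_role_arn
      subnet_ids      = var.subnet_ids
      capacity_type   = "SPOT"
      instance_types  = ["t3.xlarge"]

      scaling_config {
        min_size     = 1
        desired_size = 1
        max_size     = 5
      }
    }

    # Separate node group for the larger instance type.
    # min_size = 0 lets it sit empty until larger nodes are needed.
    resource "aws_eks_node_group" "spot_large" {
      cluster_name    = var.cluster_name
      node_group_name = "spot-c5-4xlarge"
      node_role_arn   = var.node_role_arn
      subnet_ids      = var.subnet_ids
      capacity_type   = "SPOT"
      instance_types  = ["c5.4xlarge"]

      scaling_config {
        min_size     = 0
        desired_size = 0
        max_size     = 3
      }
    }
    ```

    With one instance type per group, each group's scaling can be driven independently. Note that a group sitting at zero only gains nodes when something raises its desired size, for example an autoscaler running in the cluster or a manual change.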