In AWS, I have an ECS cluster containing a service backed by 2 EC2 instances. I sent 3 separate API requests to this service, each of which should take about an hour to run at 100% capacity. I sent the requests a couple of minutes apart, but they all went to the same instance and left the other idle. Here's a graph of my Service CPU Utilization; it is not using all of its capacity. What am I missing? Why won't requests go to the second EC2 instance?
An ALB will not perfectly round-robin between two instances. If you sent 100 requests 100 times, then on average each instance would receive 50 requests, but most of the time it won't be exactly 50 for each backend.
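To see why a small number of requests can land lopsidedly, here is a toy simulation. It deliberately simplifies: it assumes each request is routed independently at random between the two targets, which is not exactly how an ALB balances (it round-robins per load-balancer node and reuses keep-alive connections), but it illustrates the same point — with only 3 requests, very uneven splits are common.

```python
import random

def all_on_one_backend(requests, backends=2, trials=100_000, seed=0):
    """Estimate how often `requests` independent random picks
    all land on a single backend."""
    rng = random.Random(seed)
    all_one = 0
    for _ in range(trials):
        picks = {rng.randrange(backends) for _ in range(requests)}
        if len(picks) == 1:          # every request hit the same target
            all_one += 1
    return all_one / trials

# With 3 requests over 2 instances, all three land on the same
# instance about 25% of the time (2 * 0.5**3 = 0.25).
print(all_on_one_backend(3))
```

So even a "fair" balancer gives you a noticeable chance of exactly the situation in the question when the request count is tiny; load balancing only evens out in aggregate.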
For a long-running task like this it is preferable to use something else, such as SQS, whereby each container will only process x messages at a time (most of the time you'd want x=1). Each instance can then poll SQS for the work, and won't take more work whilst it is busy.
You will receive other benefits too, such as being able to see how long a message takes to finish, and error-handling capabilities to account for timeouts or a server dying whilst it is doing work.
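A minimal sketch of such a worker loop, assuming boto3 and a queue URL of your own (the queue URL and the hour-long `handler` are placeholders, not anything from the question):

```python
def process_one(sqs, queue_url, handler):
    """Receive at most one SQS message, run the handler, then delete it.

    While the handler runs, the message stays invisible to other workers
    for VisibilityTimeout seconds; if this instance dies mid-job, SQS
    makes the message visible again so another instance can retry it.
    Returns True if a message was processed, False if the queue was empty.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,   # x=1: one long job per worker at a time
        WaitTimeSeconds=20,      # long polling to cut empty receives
        VisibilityTimeout=4500,  # somewhat longer than the ~1h job
    )
    messages = resp.get("Messages", [])
    if not messages:
        return False
    msg = messages[0]
    handler(msg["Body"])         # the hour-long work happens here
    # Only delete after the work succeeds, so a crash mid-job
    # leaves the message to be redelivered.
    sqs.delete_message(QueueUrl=queue_url,
                       ReceiptHandle=msg["ReceiptHandle"])
    return True

# On each instance, roughly:
#   import boto3
#   sqs = boto3.client("sqs")
#   while True:
#       process_one(sqs, "https://sqs.../my-queue", do_hour_of_work)
```

Because each instance only asks for the next message once it has finished (and deleted) the previous one, the work spreads itself across however many instances are idle — no load balancer guesswork involved.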