Tags: amazon-web-services, amazon-elb, amazon-ecs

AWS - Load balancing for ECS service with a hard connection limit per container


I have a container deployed on ECS Fargate as a service. The container should serve long-lived HTTP WebSocket connections and perform real-time processing. Each connection can live from a few minutes to a few hours, depending on the use case.

Each container can serve only a fixed number of connections at a time (e.g. a maximum of 10) so that it can process the input in real time.

An AWS Application Load Balancer sits in front of this service. With regular auto scaling rules, the number of containers can be scaled out or in by monitoring CPU. The Application Load Balancer routes each incoming request using a round-robin algorithm.

My questions:

Given the requirement of a constant HARD limit on connections per container, how can I force the ALB not to route new connections to a container with no available connection slot?

Can the service inside the container tell the ALB that it is closed to new connections, perhaps via a specific HTTP response?

Is there any other good practice to handle this requirement?


Solution

  • You will need to write your own code for this.

    A possible solution is to combine:

    • Auto Scaling
    • Lifecycle hooks
    • Container Instance Draining

    Your code will need to detect how many connections it is processing. When the number hits your limit of 10, remove the container from the auto scaling group. By using Lifecycle hooks, you can keep the container alive. Once those 10 connections drain to 0, complete the termination of the container (see the first sketch below).

    Note that this will cause a new container to be launched while you are draining the one that has reached its limit.

    I don't know of another method to tell the ALB to stop sending traffic to a specific container without removing it. The key is the draining and termination lifecycle part, as you want the container to keep serving its existing client connections (see the second sketch below).
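
The answer leaves the "detect connections and stop sending traffic" code to you. Here is a minimal Python/boto3 sketch of one possible reading of that step: the task tracks its own connection count and deregisters its target from the ALB target group once the limit is hit, then exits when the last connection closes. The `TARGET_GROUP_ARN`, `TASK_IP`, and `TASK_PORT` environment variables are assumptions you would have to supply yourself (in awsvpc/Fargate mode the ALB registers each task by its private IP), and target deregistration is just one concrete way to stop new routing, not necessarily the exact mechanism the answer has in mind.

```python
import os
import threading

import boto3

# Hypothetical configuration supplied by you, e.g. via the task definition.
TARGET_GROUP_ARN = os.environ["TARGET_GROUP_ARN"]
TASK_IP = os.environ["TASK_IP"]
TASK_PORT = int(os.environ.get("TASK_PORT", "8080"))
MAX_CONNECTIONS = 10

elbv2 = boto3.client("elbv2")

_lock = threading.Lock()
_active_connections = 0
_deregistered = False


def on_connection_opened():
    """Call this when a WebSocket connection is accepted."""
    global _active_connections, _deregistered
    with _lock:
        _active_connections += 1
        if _active_connections >= MAX_CONNECTIONS and not _deregistered:
            # Stop the ALB from routing new connections to this task.
            # Existing connections keep flowing until the target group's
            # deregistration delay expires or they close on their own.
            elbv2.deregister_targets(
                TargetGroupArn=TARGET_GROUP_ARN,
                Targets=[{"Id": TASK_IP, "Port": TASK_PORT}],
            )
            _deregistered = True


def on_connection_closed():
    """Call this when a WebSocket connection ends."""
    global _active_connections
    with _lock:
        _active_connections -= 1
        if _deregistered and _active_connections == 0:
            # All connections have drained; let the process exit so the ECS
            # service scheduler replaces the task with a fresh one.
            os._exit(0)
```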
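
As a companion to the draining point, here is a hedged sketch of how the task could confirm that the ALB has finished draining it before shutting down, using boto3's describe_target_health. It assumes the same hypothetical `TARGET_GROUP_ARN`, `TASK_IP`, and `TASK_PORT` values as the previous sketch.

```python
import os
import time

import boto3

# Same hypothetical configuration as the previous sketch.
TARGET_GROUP_ARN = os.environ["TARGET_GROUP_ARN"]
TASK_IP = os.environ["TASK_IP"]
TASK_PORT = int(os.environ.get("TASK_PORT", "8080"))

elbv2 = boto3.client("elbv2")


def wait_until_drained(poll_seconds=15):
    """Block until the ALB no longer reports this target as draining.

    While the state is 'draining', the ALB is still honouring existing
    connections (up to the target group's deregistration delay), so the
    task should keep serving them and only shut down afterwards.
    """
    while True:
        response = elbv2.describe_target_health(
            TargetGroupArn=TARGET_GROUP_ARN,
            Targets=[{"Id": TASK_IP, "Port": TASK_PORT}],
        )
        states = [
            d["TargetHealth"]["State"]
            for d in response["TargetHealthDescriptions"]
        ]
        if "draining" not in states:
            return
        time.sleep(poll_seconds)
```

In practice you would call `wait_until_drained()` from your shutdown path (for example a SIGTERM handler) so the task only stops once the load balancer has stopped counting it as draining and the remaining client connections have closed.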