selenium-webdriver amazon-ecs aws-fargate

ECS Fargate: return the essential container(s) exit code


Explanation

I am currently using Fargate on Amazon ECS. In my setup, I have a task definition that includes two containers:

  1. Python code container: Runs the main application logic.
  2. Ephemeral service container: Supports the Python code by providing a temporary service that the Python code relies on (Selenium-hub).

Problem

I want the following behavior:

  • When the Python code completes its execution, it should exit cleanly.
  • Simultaneously, the ephemeral service container should be terminated.

However, I'm running into an issue with how ECS handles the essential property in the task definition. When I set essential to true for the Python container, ECS stops the rest of the task as soon as the Python container finishes, which means it sends SIGTERM to the ephemeral service container. The service container then returns a non-zero exit code, which is not desirable.

Desired Behavior

I need ECS to logically AND the return codes of only the essential containers, meaning that the non-essential containers' exit codes should not affect the overall task status. Specifically, I want the task to only consider the exit code of the Python container, ignoring the exit code of the ephemeral service container when determining if the task succeeded or failed.
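To make the rule concrete, here is a minimal sketch of the check I have in mind, written as a plain shell function. The function name, the "name essential exit_code" record format, and the sample values are all hypothetical, made up for illustration; ECS does not produce this format, so the records would have to be assembled by hand from the task definition and the stopped task's container details.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "name essential exit_code" records on stdin
# (one container per line) and returns 0 only if every container marked
# essential exited with code 0. Non-essential containers are ignored.
essential_containers_succeeded() {
  while read -r name essential exit_code; do
    [ -z "$name" ] && continue          # skip blank lines
    if [ "$essential" = "true" ] && [ "$exit_code" -ne 0 ]; then
      return 1
    fi
  done
  return 0
}

# Illustrative records (assumed values): the Python container succeeded,
# the Selenium sidecar was stopped by a signal and reported non-zero.
records='python-app true 0
selenium-hub false 143'

if printf '%s\n' "$records" | essential_containers_succeeded; then
  verdict="succeeded"
else
  verdict="failed"
fi
echo "task ${verdict}"
```

With these sample records the sidecar's exit code is skipped, so only the Python container decides the verdict.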

Question

Is there a way, or a workaround, to achieve this?


Solution

  • As @MedAgou mentioned, there isn't a straightforward way to handle this specific scenario (i.e., managing signals or modifying the exit code in a non-essential sidecar container).

    Here are two approaches to resolve this issue when using selenium-hub and Python code containers:

    1. Creating a Custom Image: You could create a custom Docker image that combines the functionality of both containers. However, this approach increases system complexity and creates a tightly coupled codebase, which is generally not recommended.

    2. Modifying the Entry Point of the Selenium Hub Container: A more flexible solution involves modifying the entry point of the selenium-hub container. I explored this repository and adapted the entry_point.sh script as follows:

      #!/usr/bin/env bash
      
      NODE_CONFIG_DIRECTORY=${NODE_CONFIG_DIRECTORY:-"/opt/bin"}
      #==============================================
      # OpenShift or non-sudo environments support
      # https://docs.openshift.com/container-platform/3.11/creating_images/guidelines.html#openshift-specific-guidelines
      #==============================================
      
      if ! whoami &> /dev/null; then
        if [ -w /etc/passwd ]; then
          echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${HOME}:/sbin/nologin" >> /etc/passwd
        fi
      fi
      
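      # Run supervisord in the background so this shell stays free to handle signals.
      /usr/bin/supervisord --configuration /etc/supervisord.conf &
      
      SUPERVISOR_PID=$!
      
      # On SIGTERM or SIGINT, exit with status 0 so the container reports success.
      function shutdown {
          echo "The following line does the trick!"
          exit 0
      }
      
      trap shutdown SIGTERM SIGINT
      wait "${SUPERVISOR_PID}"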
      

      Dockerfile:

      ARG IMAGE_TAG=4.23.1-20240820
      FROM selenium/standalone-chrome:${IMAGE_TAG}
      
      LABEL maintainer="Mostafa"
      
      COPY modified-entry-point.sh /opt/bin/modified-entry-point.sh
      RUN chmod +x /opt/bin/modified-entry-point.sh
      ENTRYPOINT ["/opt/bin/modified-entry-point.sh"]
      

    With this entry point, the Selenium container traps the SIGTERM that ECS sends after the Python container exits and returns exit code 0 instead of a non-zero code. This approach keeps your services loosely coupled and maintains the desired behavior in ECS.
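    If you want to see the effect of the trap before building an image, the following is a rough local check with no Docker or ECS involved. It is only a sketch under assumptions: entrypoint_stub is a hypothetical stand-in for the modified entry point, and sleep 30 stands in for supervisord. Unlike the script above, the stub also stops its child before exiting so the demo leaves no stray process behind.

```shell
#!/usr/bin/env bash
# Local check of the trap-and-exit-0 pattern.
# Assumptions: `sleep 30` stands in for supervisord, and entrypoint_stub is a
# hypothetical stand-in for the modified entry point.

entrypoint_stub() {
  # Install the handler first: stop the child, then exit with status 0.
  trap 'kill -TERM "$child_pid" 2>/dev/null; exit 0' TERM INT
  sleep 30 &
  child_pid=$!
  wait "$child_pid"
}

# Case 1: a process with the trap, stopped with SIGTERM like an ECS sidecar.
( entrypoint_stub ) &
stub_pid=$!
sleep 1                          # give the subshell time to install its trap
kill -TERM "$stub_pid"
stub_status=0
wait "$stub_pid" || stub_status=$?

# Case 2: control without a trap, stopped the same way.
sleep 30 &
plain_pid=$!
kill -TERM "$plain_pid"
plain_status=0
wait "$plain_pid" || plain_status=$?

echo "with trap:    exit code ${stub_status}"
echo "without trap: exit code ${plain_status}"
```

    On Linux I would expect the first case to report 0 and the control to report 143 (128 + 15, the usual shell encoding for a process killed by SIGTERM), which is the difference the modified entry point relies on.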