aws-lambda amazon-ecs aws-sdk-nodejs aws-event-bridge

Why does this AWS Lambda function keep looping from this EventBridge rule for AWS ECS?


I have an EventBridge rule that invokes a Lambda function (its target) when an ECS task starts (status: RUNNING). The Lambda function does some work and, at the end, is supposed to stop the ECS task.

I have the following EventBridge rule:

{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Task State Change"],
  "detail": {
    "clusterArn": ["<cluster-arn>"],
    "lastStatus": ["RUNNING"],
    "desiredStatus": ["RUNNING"]
  }
}

And it invokes the Lambda function.

The following is a simplified version of the Lambda function:

import { ECSClient, StopTaskCommand } from "@aws-sdk/client-ecs";

export const handler = async (event) => {
    // event is the EventBridge "ECS Task State Change" event captured for the task
    // Do something
    const ecs = new ECSClient({ region: "<region>" });
    // ARN of the task that triggered this invocation
    const taskArn = event.detail.containers[0].taskArn;
    const stopTask = new StopTaskCommand({
        cluster: "<cluster-arn>",
        reason: "<reason>",
        task: taskArn
    });
    try {
        // Resolves once ECS has accepted the stop request
        const data = await ecs.send(stopTask);
    } catch (error) {
        console.error(error);
    }
};

When I start the ECS task, the Lambda function is invoked and starts to run. After it finishes its work, it stops the ECS task that invoked it. I use the AWS SDK v3 for this, taking the taskArn from the event parameter of the Lambda function. The Lambda function successfully stops the ECS task (the result of the send command has a 200 HTTP status code). However, the Lambda function is then invoked again, and this repeats forever (I checked the CloudWatch logs for the function).

I am not sure why the Lambda function starts up again as, from what I can tell, the EventBridge rule shouldn't trigger it.


Solution

  • This is not a problem with your Lambda function. It is the expected behavior of your ECS service.

    Your ECS service has a "desired count" property, which tells the service how many tasks it should keep running.

    If the service is configured for 1 replica and you stop its task, the service launches a new one to get back to the desired count. The replacement task reaches RUNNING, which matches your rule and invokes the Lambda function again. That is why you see the same activity over and over.

    Stopping individual tasks of a service is probably not the correct approach.

    What I would do is change the "desired count" to zero, without using the Lambda function.
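
    The answer suggests changing the desired count outside the Lambda function (console or CLI). If it does have to happen in code, a minimal sketch with the same SDK could look like the following. It assumes the task was started by a service, in which case the event's detail.group field is expected to look like "service:<service-name>". The helper names buildScaleToZeroInput and scaleServiceToZero are hypothetical, and the SDK is loaded lazily only so the pure helper can be tried without AWS access.

```javascript
// Sketch: scale the owning ECS service to zero instead of stopping one task.
// Assumption: service-launched tasks report detail.group as "service:<name>".

// Pure helper: derive UpdateService parameters from the EventBridge event.
// Returns null when the task does not appear to belong to a service.
function buildScaleToZeroInput(event) {
  const detail = (event && event.detail) || {};
  const group = detail.group || "";
  const prefix = "service:";
  if (!detail.clusterArn || !group.startsWith(prefix)) {
    return null; // standalone task: no service will replace it
  }
  return {
    cluster: detail.clusterArn,
    service: group.slice(prefix.length),
    desiredCount: 0, // the service stops its tasks and launches no replacements
  };
}

// Sends the UpdateService call; resolves to null when there is nothing to scale.
async function scaleServiceToZero(event) {
  const input = buildScaleToZeroInput(event);
  if (input === null) return null;
  const { ECSClient, UpdateServiceCommand } = await import("@aws-sdk/client-ecs");
  const ecs = new ECSClient({}); // region is resolved from the environment (e.g. AWS_REGION in Lambda)
  return ecs.send(new UpdateServiceCommand(input));
}
```

    With the desired count at zero, the service has no reason to start a replacement task, so the rule is not matched again. The function's execution role would also need permission for ecs:UpdateService.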