amazon-web-services amazon-ec2 aws-lambda amazon-ecs

Do not terminate EC2 instance if Lifecycle hook timeout is reached


I need to let an EC2 instance keep running if the Lambda draining mechanism that terminates it doesn't finish within the HeartbeatTimeout set for the lifecycle hook in CloudFormation.

I have a Lambda function that drains an EC2 instance and terminates it when a scale-down alarm is triggered in CloudFormation. I currently use a lifecycle hook in my CloudFormation template to terminate the instance. However, I understand that the lifecycle hook has a HeartbeatTimeout parameter, and that the instance is killed if the Lambda draining mechanism doesn't finish within this period. I do not want to kill the instance if the Lambda is unable to drain it within the HeartbeatTimeout, since tasks are still running on it. In that case I'd like to abort the termination and let the instance keep running. Is there any way to do this?

Here is the lifecycle hook in CloudFormation:

"Terminationhook": {
    "Type": "AWS::AutoScaling::LifecycleHook",
    "Properties": {
      "AutoScalingGroupName": { "Ref": "Cluster" },
      "DefaultResult": "ABANDON",
      "HeartbeatTimeout": "3600",
      "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
      "NotificationTargetARN": { "Ref" : "SNSTOPIC"},
      "RoleARN": {
        "Fn::GetAtt": [
          "Role",
          "Arn"
        ]
      }
    },
    "DependsOn": "SNSTOPIC"
}

If the Lambda doesn't drain the instance within the HeartbeatTimeout of 3600 seconds, I want to abort the instance termination.


Solution

  • From Amazon EC2 Auto Scaling Lifecycle Hooks - Amazon EC2 Auto Scaling:

    Keeping Instances in a Wait State

    Instances can remain in a wait state for a finite period of time. The default is one hour (3600 seconds). You can adjust this time in the following ways:

    • Set the heartbeat timeout for the lifecycle hook when you create the lifecycle hook. With the put-lifecycle-hook command, use the --heartbeat-timeout parameter. With the PutLifecycleHook operation, use the HeartbeatTimeout parameter.
    • Continue to the next state if you finish before the timeout period ends, using the complete-lifecycle-action command or the CompleteLifecycleAction operation.
    • Restart the timeout period by recording a heartbeat, using the record-lifecycle-action-heartbeat command or the RecordLifecycleActionHeartbeat operation. This increments the heartbeat timeout by the timeout value specified when you created the lifecycle hook. For example, if the timeout value is one hour, and you call this command after 30 minutes, the instance remains in a wait state for an additional hour, or a total of 90 minutes.

    The maximum amount of time that you can keep an instance in a wait state is 48 hours or 100 times the heartbeat timeout, whichever is smaller.
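    To make the quoted limit concrete, here is a small arithmetic sketch in Python. The function name is made up for illustration, and the formula is taken only from the sentence quoted above, so treat it as a reading of that sentence rather than an official calculation.

```python
# Sketch of the quoted rule: the wait state is capped at 48 hours or
# 100 times the heartbeat timeout, whichever is smaller.
# `max_wait_seconds` is a hypothetical helper name, not an AWS API.

FORTY_EIGHT_HOURS = 48 * 3600  # 172800 seconds


def max_wait_seconds(heartbeat_timeout_seconds: int) -> int:
    """Return the longest total wait implied by the quoted limit."""
    return min(FORTY_EIGHT_HOURS, 100 * heartbeat_timeout_seconds)


# With the HeartbeatTimeout of 3600 from the question, the 48-hour cap applies:
# min(172800, 360000) = 172800 seconds.
print(max_wait_seconds(3600))
```

    With the 3600-second timeout from the question, heartbeats could therefore keep the instance in the wait state for up to 48 hours in total under this rule.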

    Bottom line: If you need more time, you can restart the timeout period by recording a heartbeat.
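    As a rough illustration of recording a heartbeat from the draining Lambda, here is a minimal Python sketch. It assumes the hook notification arrives through SNS (as in the question's NotificationTargetARN) and that the caller passes in a boto3 Auto Scaling client. The function names and the `drained` flag are hypothetical; how you decide that the instance is drained is up to your own logic.

```python
import json


def parse_lifecycle_message(sns_event):
    """Pull the lifecycle action fields out of an SNS-delivered notification."""
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    return {
        "LifecycleHookName": message["LifecycleHookName"],
        "AutoScalingGroupName": message["AutoScalingGroupName"],
        "LifecycleActionToken": message["LifecycleActionToken"],
        "InstanceId": message["EC2InstanceId"],
    }


def keep_waiting_or_finish(autoscaling, action, drained):
    """Complete the hook if draining is done, otherwise restart the timeout.

    `autoscaling` is expected to behave like boto3.client("autoscaling").
    `drained` is a hypothetical flag computed by your own draining check.
    """
    if drained:
        # Draining finished: let the lifecycle action proceed.
        autoscaling.complete_lifecycle_action(
            LifecycleActionResult="CONTINUE", **action
        )
        return "completed"
    # Tasks are still running: record a heartbeat to restart the timeout period.
    autoscaling.record_lifecycle_action_heartbeat(**action)
    return "heartbeat"


# In a Lambda handler this might be wired up roughly as follows (assumption):
#   autoscaling = boto3.client("autoscaling")
#   action = parse_lifecycle_message(event)
#   keep_waiting_or_finish(autoscaling, action, drained=False)
```

    Keep in mind that, according to the AWS documentation, a terminating instance is still terminated whether the hook ends with ABANDON or CONTINUE. The heartbeat therefore postpones termination while tasks finish, within the maximum wait time quoted above, rather than cancelling it.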