How are people handling connection loss when an Amazon Auto Scaling group scales down?

As I understand it, when an Amazon Auto Scaling group scales down, any connections open to the terminated instance are simply lost; there is no graceful termination.

I'm wondering how others are handling this.

My thinking is that the initiator of the connection should handle the failure, since it already has to cope with an instance failing unexpectedly rather than being deliberately terminated.
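A minimal sketch of that idea, assuming the client can safely repeat the request (i.e. it is idempotent). The helper name `call_with_retry` and its parameters are made up for illustration; `operation` stands for whatever opens the connection and sends the request.

```python
import random
import time


def call_with_retry(operation, attempts=4, base_delay=0.2, sleep=time.sleep):
    """Run operation(), retrying on connection errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Back off with jitter so clients do not all reconnect at once.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With something like this in place, a scale-down looks the same to the caller as any other instance failure: the retry lands on one of the remaining instances.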

Any thoughts?

Thanks,

Pete

Solution

I did it with a lifecycle hook, which can pause the termination process for a set amount of time (the default is 1 hour).

It is designed to be resumed once your work is complete, but simply letting the timeout expire worked as a hacky form of connection draining.
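For the non-hacky route, here is a rough sketch of resuming termination once the work is done, rather than waiting for the timeout. It assumes `client` is a boto3 Auto Scaling client (`boto3.client("autoscaling")`) and that a terminating hook already exists on the group. `drain_then_continue` and `active_connections` are hypothetical names; how you count open connections depends on your server.

```python
import time


def drain_then_continue(client, group, hook, instance_id,
                        active_connections, max_wait=300, poll=5,
                        sleep=time.sleep, clock=time.monotonic):
    """Wait for connections to drain, then let termination proceed."""
    deadline = clock() + max_wait
    # Poll until nothing is connected or we run out of patience.
    while active_connections() > 0 and clock() < deadline:
        sleep(poll)
    # CONTINUE tells Auto Scaling to carry on terminating the instance.
    client.complete_lifecycle_action(
        AutoScalingGroupName=group,
        LifecycleHookName=hook,
        InstanceId=instance_id,
        LifecycleActionResult="CONTINUE",
    )
```

If draining can take longer than the hook's timeout, `record_lifecycle_action_heartbeat` on the same client extends the wait.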

You have the option of adding a hook to your Auto Scaling group to put instances in this state into a Terminating:Wait state. This state allows you to access these instances before they are terminated.

source: http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroupLifecycle.html

Con: the hook has to be set up via the CLI, but that's not too bad.

How to do that: http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/adding-lifecycle-hooks.html
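If you would rather script it than use the CLI, the same setup can be sketched with boto3. This is only an illustration: it assumes `client` is `boto3.client("autoscaling")`, and the function name, the group and hook names, and the 300-second timeout are placeholders, not values from my setup.

```python
def add_termination_hook(client, group, hook, timeout_seconds=300):
    """Register a hook that holds terminating instances in Terminating:Wait."""
    client.put_lifecycle_hook(
        AutoScalingGroupName=group,
        LifecycleHookName=hook,
        LifecycleTransition="autoscaling:EC2_INSTANCE_TERMINATING",
        HeartbeatTimeout=timeout_seconds,  # how long the instance waits
        DefaultResult="CONTINUE",  # on timeout, proceed with termination
    )


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   add_termination_hook(boto3.client("autoscaling"), "my-asg", "drain-hook")
```

A short `HeartbeatTimeout` is what makes the "let it time out" trick tolerable, since the instance keeps running (and billing) for the whole wait.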

When creating the IAM role you will need this policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:PutLifecycleHook",
        "autoscaling:DeleteLifecycleHook",
        "autoscaling:RecordLifecycleActionHeartbeat",
        "autoscaling:CompleteLifecycleAction",
        "autoscaling:DescribeAutoscalingGroups",
        "autoscaling:DescribeLifecycleHooks",
        "autoscaling:PutInstanceInStandby",
        "autoscaling:PutInstanceInService",
        "iam:AddRoleToInstanceProfile",
        "iam:CreateInstanceProfile",
        "iam:CreateRole",
        "iam:PassRole",
        "iam:ListInstanceProfiles",
        "ec2:Describe*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Good luck!