Search code examples
amazon-web-servicesamazon-ec2aws-cloudformationamazon-elbcloud-init

how to wait for cfn-signal before adding a booting EC2 to the ELB


my question is about achieving to wait for cfn-signal before adding a booting EC2 to the ELB (if possible, using cloudformation). For now, the ELB start using the instance without waiting it to be ready, worst, if I do start the apache service only at the end of the "boot", it do not wait for the grace period and remove the instance before it has finish to be initialized.

here is my AutoScalingGroup conf :

"webServerASG": {
  "Type": "AWS::AutoScaling::AutoScalingGroup",

  "CreationPolicy": {
    "ResourceSignal": {
      "Timeout": "PT10M",
      "Count": { "Ref": "WebServerDesiredCapacity" }
    }
  },
  "UpdatePolicy" : {
    "AutoScalingReplacingUpdate" : {
      "WillReplace" : true
    },
    "AutoScalingRollingUpdate" : {
      "MaxBatchSize" : 1,
      "MinInstancesInService" : 1,
      "MinSuccessfulInstancesPercent" : 75,
      "SuspendProcesses" : [ "AlarmNotification", "ScheduledActions", "AZRebalance", "ReplaceUnhealthy", "HealthCheck", "Launch" ],
      "WaitOnResourceSignals": true,
      "PauseTime" : "PT10M"
    }
  },
  "Properties": {
    "AvailabilityZones": [...],
    "Cooldown": "60",
    "HealthCheckGracePeriod": "560",
    "HealthCheckType": "ELB",
    "MaxSize": { "Ref": "WebServerMaxCapacity" },
    "MinSize": "1",
    "DesiredCapacity": { "Ref": "WebServerDesiredCapacity" },
    "VPCZoneIdentifier": [...],
    "NotificationConfigurations": [...],
    "LaunchConfigurationName": { "Ref": "webServer01LC" },
    "LoadBalancerNames": [ { "Ref": "webServerLoadBalancerELB" } ],
    "MetricsCollection": [...],
    "TerminationPolicies": [
      "Default"
    ]
  }
},

with this conf for the LoadBalancer :

"webServerLoadBalancerELB": {
  "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
  "Properties": {
    "Subnets": [...],
    "HealthCheck": {
      "HealthyThreshold": "10",
      "Interval": "30",
      "Target": "HTTP:80/aws/healthcheck.php",
      "Timeout": "5",
      "UnhealthyThreshold": "2"
    },
    "ConnectionDrainingPolicy": {
      "Enabled": "true",
      "Timeout": "300"
    },
    "ConnectionSettings": {
      "IdleTimeout": "60"
    },
    "CrossZone": "true",
    "SecurityGroups": [...],
    "Listeners": [
      {
        "LoadBalancerPort": "80",
        "Protocol": "HTTP",
        "InstancePort": "80",
        "InstanceProtocol": "HTTP"
      }
    ]
  }
}

and for the LaunchConfiguration :

"webServer01LC": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "AssociatePublicIpAddress": false,
    "ImageId": {...},
    "InstanceType": "t2.small",
    "KeyName": {...},
    "IamInstanceProfile": "EC2WebServerRole",
    "SecurityGroups": [...],
    "BlockDeviceMappings": [...],
    "UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
      "#!/bin/bash\n",

      "echo '\n### pip install cloud-init : ' | tee --append /var/log//userData.log \n",
      "pip install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz  2>> /var/log//userData.log\n",
      "ln -s /usr/local/init/ubuntu/cfn-hup /etc/init.d/cfn-hup  >> /var/log//userData.log>&1\n",


      "echo '\n### execute the AWS::CloudFormation::Init defined in the Metadata' | tee --append /var/log//userData.log \n",
      "cfn-init -v",
              " --region ", { "Ref" : "AWS::Region" },
              " --stack ", { "Ref" : "AWS::StackName" },
              " --resource lciTophubWebServer01LC ",
              " \n",

      "cfn-signal --exit-code $? --region ", { "Ref" : "AWS::Region" }, " --stack ", { "Ref" : "AWS::StackName" }, " --resource asgiTophubWebServerASG \n",
      "echo '\n### end of the script!' | tee --append /var/log//userData.log \n"
    ]]}}
  },
  "Metadata": {...}
}

Solution

  • I found out my problem : ELB status checks are run without pause, so during the grace period, the tests are running. once the grace period is finished, the autoscaling group immediately look at the current ELB status checks.

    In my case, I had "HealthyThreshold": "10" and "Interval": "30" so it took ELB 10*30=300 sec. to change its minds (even 329.99 if I'm not lucky) and it is way too long compared to the HealthCheckGracePeriod which is only 560 (560 - 329.99 = 230.01 sec. left to have finished starting).

    I shortened this "become healthy again" delay by passing HealthyThreshold at 2 : 560 - 99.99 = 460.01 sec. left to start.