Tags: amazon-web-services, amazon-ecs, amazon-cloudwatch

ECS deployment changes target group - how to maintain alarms that depend on target group?


I have a workload running as an ECS service attached to a target group, and an alarm that monitors that target group's healthy host count (HealthyHostCount). I'd like to implement blue/green deployments using two target groups, but because the alarm monitors one specific target group, it seems the alarm would have to be updated after every deployment, separately from the deployment itself.

This seems fragile (for example, a post-deployment script that repoints the alarm at the new target group could fail), and I suspect there is a better way, but I can't see it. Is there an obviously easier solution?


Solution

  • Instead of monitoring that you have the desired number of healthy targets, monitor that you have no unhealthy ones. Your ECS service takes care of maintaining the desired count, and you may want to scale the service, which changes the expected number of healthy targets, so I think UnHealthyHostCount is the better metric to alarm on.

    Create one alarm for each target group, as below.

    These won't trigger during normal ECS blue/green deployments, only when a registered target is failing health checks. You need to tune the health-check settings on the target group and the HealthCheckGracePeriodSeconds setting on the ECS service accordingly.

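      BlueUnHealthyHostCountAlarm:
        Type: 'AWS::CloudWatch::Alarm'
        Properties:
          AlarmDescription: 'Alarms when there is any unhealthy target'
          Namespace: 'AWS/ApplicationELB'
          MetricName: UnHealthyHostCount
          Statistic: Maximum
          Period: 60
          EvaluationPeriods: 2
          ComparisonOperator: GreaterThanThreshold
          # Threshold 0 so that a single unhealthy target is enough to alarm
          Threshold: 0
          AlarmActions:
          # Placeholder: SNS topic ARN, e.g. a !Ref to your topic resource
          - Topic
          Dimensions:
          # Placeholder: load balancer full name (LoadBalancerFullName attribute)
          - Name: LoadBalancer
            Value: AlbFullName
          # Placeholder: target group full name (TargetGroupFullName attribute)
          - Name: TargetGroup
            Value: BlueTargetGroup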
    
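      GreenUnHealthyHostCountAlarm:
        Type: 'AWS::CloudWatch::Alarm'
        Properties:
          AlarmDescription: 'Alarms when there is any unhealthy target'
          Namespace: 'AWS/ApplicationELB'
          MetricName: UnHealthyHostCount
          Statistic: Maximum
          Period: 60
          EvaluationPeriods: 2
          ComparisonOperator: GreaterThanThreshold
          # Threshold 0 so that a single unhealthy target is enough to alarm
          Threshold: 0
          AlarmActions:
          # Placeholder: SNS topic ARN, e.g. a !Ref to your topic resource
          - Topic
          Dimensions:
          # Placeholder: load balancer full name (LoadBalancerFullName attribute)
          - Name: LoadBalancer
            Value: AlbFullName
          # Placeholder: target group full name (TargetGroupFullName attribute)
          - Name: TargetGroup
            Value: GreenTargetGroup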
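
    As a rough sketch of the tuning mentioned above, the health-check settings live on each target group and the grace period lives on the ECS service. This is only a fragment: other required properties are omitted, and the resource name Service, the /health path and every number are assumptions to adapt, not recommended values.

      BlueTargetGroup:
        Type: 'AWS::ElasticLoadBalancingV2::TargetGroup'
        Properties:
          # Hypothetical health endpoint exposed by the container
          HealthCheckPath: /health
          # Illustrative timings; the timeout must be shorter than the interval
          HealthCheckIntervalSeconds: 15
          HealthCheckTimeoutSeconds: 5
          HealthyThresholdCount: 2
          UnhealthyThresholdCount: 3

      Service:
        Type: 'AWS::ECS::Service'
        Properties:
          # Time after task start during which ECS ignores failing
          # load balancer health checks (illustrative value)
          HealthCheckGracePeriodSeconds: 60

    The same health-check properties would go on GreenTargetGroup. With the alarms above (Period 60, EvaluationPeriods 2), a target has to stay unhealthy for about two minutes before the alarm fires, so short-lived failures during task startup should not trigger it if startup fits within that window.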