Tags: amazon-web-services, aws-cli, amazon-cloudwatch, amazon-route53, cloudwatch-alarms

Unable to create a Route 53 health check with associated CloudWatch alarm


I'm attempting to create a new Route 53 health check with an alarm that should be triggered when the health check fails. I'm using the following command and health check config:

aws route53 create-health-check --caller-reference cw-20240607 \
  --health-check-config file://health_check_config.json

health_check_config.json contents:

{
  "Type": "HTTPS_STR_MATCH",
  "FullyQualifiedDomainName": "https://api.dev.somedomain.dev",
  "Port": 443,
  "ResourcePath": "/health",
  "SearchString": "\"status\":\"ok\"",
  "EnableSNI": true,
  "AlarmIdentifier": {
    "Name": "regen-api-health-check",
    "Region": "us-east-1"
  }
}

I'm getting the following error:

An error occurred (InvalidInput) when calling the CreateHealthCheck operation:
Invalid parameter : Basic health checks must not have an metric region specified.

I did try removing the region from the AlarmIdentifier, but of course it's a required parameter, so the new error is:

Parameter validation failed:
Missing required parameter in HealthCheckConfig.AlarmIdentifier: "Region"

I found this 6-year-old issue in the aws-sdk-php repo, which indicates you can't specify an AlarmIdentifier with a "basic" health check. However, I can't find any reference in the docs as to what constitutes a "basic" health check, or what config would allow me to specify an alarm identifier.

So, my questions are:

  1. What specifically is a basic health check, as opposed to a non-basic one?
  2. How can I configure the health check with an alarm identifier, so it will trigger my CloudWatch alarm?

FWIW, I tried to do this with CloudFormation first, but the error it kept providing was extremely unhelpful and non-specific:

Resource handler returned message: "Invalid request provided: AWS::Route53::HealthCheck"

Solution

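  • I eventually figured out that AlarmIdentifier is not for specifying the alarm that the health check should trigger. It works the other way around: it is only valid for health checks of type CLOUDWATCH_METRIC, which derive their own status from the state of an existing alarm. The correct way to associate a CloudWatch alarm with the health check is to create the health check first, then specify its ID as the HealthCheckId dimension when creating the alarm. Using the CLI, this looks like: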

    aws route53 create-health-check --caller-reference cw-20240607 \
      --health-check-config file://health_check_config.json
    
    aws cloudwatch put-metric-alarm --alarm-name my-api-health-check \
      --namespace AWS/Route53 --metric-name HealthCheckStatus \
      --dimensions "Name=HealthCheckId,Value=<health check ID>" \
      --comparison-operator LessThanThreshold --statistic Average --period 60 \
      --threshold 1 --evaluation-periods 3 --datapoints-to-alarm 3 \
      --alarm-actions "arn:aws:sns:us-east-1:<account ID>:my-alerts"
    

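    health_check_config.json contents (note that FullyQualifiedDomainName takes a bare hostname, without the https:// prefix used in the question):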

    {
      "Type": "HTTPS_STR_MATCH",
      "FullyQualifiedDomainName": "api.dev.somedomain.dev",
      "Port": 443,
      "ResourcePath": "/health",
      "SearchString": "\"status\":\"ok\"",
      "EnableSNI": true,
      "Regions": ["us-east-1", "us-west-1", "us-west-2"]
    }
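
The two commands above can be chained so the health check ID does not have to be copied by hand. The following is a minimal sketch rather than a definitive script: the function name create_check_with_alarm and its three arguments are hypothetical, and it assumes health_check_config.json sits in the current directory. It relies on the CLI's global --query and --output options to pull out HealthCheck.Id, and it pins the alarm to us-east-1 because Route 53 publishes its health check metrics only in that region.

```shell
#!/bin/sh
# Sketch: create the health check, capture its ID, then create the alarm.
# create_check_with_alarm is a hypothetical helper, not part of the AWS CLI.
# Arguments: <caller reference> <alarm name> <SNS topic ARN>
create_check_with_alarm() {
  caller_ref="$1"
  alarm_name="$2"
  topic_arn="$3"

  # --query with --output text returns just the new health check's ID.
  health_check_id=$(aws route53 create-health-check \
    --caller-reference "$caller_ref" \
    --health-check-config file://health_check_config.json \
    --query 'HealthCheck.Id' --output text) || return 1

  # Route 53 health check metrics live in us-east-1, so the alarm must too.
  aws cloudwatch put-metric-alarm --region us-east-1 \
    --alarm-name "$alarm_name" \
    --namespace AWS/Route53 --metric-name HealthCheckStatus \
    --dimensions "Name=HealthCheckId,Value=$health_check_id" \
    --comparison-operator LessThanThreshold --statistic Average --period 60 \
    --threshold 1 --evaluation-periods 3 --datapoints-to-alarm 3 \
    --alarm-actions "$topic_arn" || return 1

  # Print the ID so the caller can record or reuse it.
  echo "$health_check_id"
}
```

A call might look like create_check_with_alarm cw-20240607 my-api-health-check "arn:aws:sns:us-east-1:<account ID>:my-alerts", with the account ID filled in.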
    

    Once the fix was apparent, I was able to deploy the health check and alarm via CloudFormation:

    {
      "AWSTemplateFormatVersion": "2010-09-09",
      "Description": "Testing Health Check and Alarm config",
      "Resources": {
        "Route53HealthCheck": {
          "Type": "AWS::Route53::HealthCheck",
          "Properties": {
            "HealthCheckConfig": {
              "Type": "HTTPS_STR_MATCH",
              "FullyQualifiedDomainName": "api.dev.somedomain.dev",
              "Port": 443,
              "ResourcePath": "/health",
              "SearchString": "\"status\":\"ok\"",
              "EnableSNI": true,
              "Regions": ["us-east-1", "us-west-1", "us-west-2"]
            }
          }
        },
        "HealthCheckAlarm": {
          "Type": "AWS::CloudWatch::Alarm",
          "Properties": {
            "AlarmName": "my-api-health-check",
            "Namespace": "AWS/Route53",
            "MetricName": "HealthCheckStatus",
            "ComparisonOperator": "LessThanThreshold",
            "Statistic": "Average",
            "Period": 60,
            "Threshold": 1,
            "DatapointsToAlarm": 3,
            "EvaluationPeriods": 3,
            "AlarmActions": [
              {
                "Fn::Join": [
                  ":",
                  ["arn:aws:sns", { "Ref": "AWS::Region" }, { "Ref": "AWS::AccountId" }, "my-alerts"]
                ]
              }
            ],
            "OKActions": [
              {
                "Fn::Join": [
                  ":",
                  ["arn:aws:sns", { "Ref": "AWS::Region" }, { "Ref": "AWS::AccountId" }, "my-alerts"]
                ]
              }
            ],
            "Dimensions": [
              {
                "Name": "HealthCheckId",
                "Value": { "Fn::GetAtt": ["Route53HealthCheck", "HealthCheckId"] }
              }
            ]
          }
        }
      }
    }
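
For comparison, this is roughly what a config that does accept AlarmIdentifier could look like. It is a sketch under assumptions, not something taken from the setup above: the alarm name my-existing-alarm and the file name cloudwatch_metric_check.json are placeholders, and the alarm is assumed to already exist in us-east-1. With Type set to CLOUDWATCH_METRIC, the health check takes its status from that alarm's state instead of probing an endpoint, so endpoint fields such as FullyQualifiedDomainName and ResourcePath are left out.

```shell
#!/bin/sh
# Hypothetical config for a CLOUDWATCH_METRIC health check. The alarm name is
# a placeholder for an alarm that already exists in the given region.
cat > cloudwatch_metric_check.json <<'EOF'
{
  "Type": "CLOUDWATCH_METRIC",
  "AlarmIdentifier": {
    "Name": "my-existing-alarm",
    "Region": "us-east-1"
  },
  "InsufficientDataHealthStatus": "LastKnownStatus"
}
EOF

# The file would then be passed to create-health-check (not run here):
# aws route53 create-health-check --caller-reference <unique string> \
#   --health-check-config file://cloudwatch_metric_check.json
```

This is the reverse of the arrangement in the answer: here the alarm drives the health check, whereas the HealthCheckId dimension lets the health check drive the alarm.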