amazon-web-services terraform amazon-ecs terraform-provider-aws

ALB Health checks Targets Unhealthy


I am trying to provision an ECS cluster using Terraform along with an ALB. The targets come up as Unhealthy. The console shows the error "Health checks failed with these codes: [502]". I checked through the AWS troubleshooting guide and nothing there helped.

EDIT: I have no services/tasks running on the EC2 container instances. It's a vanilla ECS cluster.

Here is my relevant code for the ALB:

# Target Group declaration 

resource "aws_alb_target_group" "lb_target_group_somm" {
  name                 = "${var.alb_name}-default"
  port                 = 80
  protocol             = "HTTP"
  vpc_id               = "${var.vpc_id}"
  deregistration_delay = "${var.deregistration_delay}"
  health_check {
    path     = "/"
    port     = 80
    protocol = "HTTP"
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Environment = "${var.environment}"
  }

  depends_on = ["aws_alb.alb"]
}

# ALB Listener with default forward rule

resource "aws_alb_listener" "https_listener" {
  load_balancer_arn = "${aws_alb.alb.id}"
  port              = "80"
  protocol          = "HTTP"

  default_action {
    target_group_arn = "${aws_alb_target_group.lb_target_group_somm.arn}"
    type             = "forward"
  }
}

# The ALB has a security group with ingress rules on TCP port 80 and egress rules to anywhere. 
# There is a security group rule for the EC2 instances that allows ingress traffic to the ECS cluster from the ALB: 

resource "aws_security_group_rule" "alb_to_ecs" {
  type                     = "ingress"
  /*from_port                = 32768 */
  from_port                = 80
  to_port                  = 65535
  protocol                 = "TCP"
  source_security_group_id = "${module.alb.alb_security_group_id}"
  security_group_id        = "${module.ecs_cluster.ecs_instance_security_group_id}"
}

Has anyone hit this error and knows how to debug or fix it?


Solution

  • It looks like you're trying to register the ECS cluster instances with the ALB target group. This isn't how you're meant to send traffic to an ECS service via an ALB.

    Instead you should have your service join the tasks to the target group. This will mean that if you are using host networking then only the instances with the task deployed will be registered. If you are using bridge networking then it will add the ephemeral ports used by your task to your target group (including allowing for there to be multiple targets on a single instance). And if you are using awsvpc networking then it will register the ENIs of every task that the service spins up.

    To do this you should use the load_balancer block in the aws_ecs_service resource. An example might look something like this:

    resource "aws_ecs_service" "mongo" {
      name            = "mongodb"
      cluster         = "${aws_ecs_cluster.foo.id}"
      task_definition = "${aws_ecs_task_definition.mongo.arn}"
      desired_count   = 3
      iam_role        = "${aws_iam_role.foo.arn}"
    
      load_balancer {
        target_group_arn = "${aws_alb_target_group.lb_target_group_somm.arn}"
        container_name   = "mongo"
        container_port   = 8080
      }
    }
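
    The service above references a task definition that is not shown. As a rough sketch only, assuming bridge networking with dynamic host ports and Terraform 0.12 or later (for jsonencode with object syntax), it could look something like the block below. The image name and memory value are placeholders, not taken from the question.

    ```hcl
    # Hypothetical task definition backing the "mongodb" service.
    resource "aws_ecs_task_definition" "mongo" {
      family       = "mongodb"
      network_mode = "bridge"

      container_definitions = jsonencode([
        {
          # Must match container_name in the service's load_balancer block.
          name      = "mongo"
          # Placeholder image; it has to serve HTTP on containerPort for the
          # target group's health check on "/" to pass.
          image     = "my-registry/my-app:latest"
          memory    = 512
          essential = true

          portMappings = [
            {
              containerPort = 8080  # must match container_port in the service
              hostPort      = 0     # 0 lets ECS pick a host port from the ephemeral range
              protocol      = "tcp"
            }
          ]
        }
      ])
    }
    ```

    With hostPort set to 0, each task gets its own host port, which is what lets the service register several targets on a single instance.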
    

    If you were using bridge networking this would mean that the tasks are accessible on the ephemeral port range on the instances so your security group rule would need to look like this:

    resource "aws_security_group_rule" "alb_to_ecs" {
      type                     = "ingress"
      from_port                = 32768 # ephemeral port range for bridge networking tasks
      to_port                  = 60999 # cat /proc/sys/net/ipv4/ip_local_port_range
      protocol                 = "TCP"
      source_security_group_id = "${module.alb.alb_security_group_id}"
      security_group_id        = "${module.ecs_cluster.ecs_instance_security_group_id}"
    }
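
    For the awsvpc case mentioned earlier, the target group has to register IP addresses rather than instances. A minimal sketch, assuming Terraform 0.12 or later and a task definition whose network_mode is "awsvpc"; the resource names lb_target_group_ip and mongo_awsvpc and the variables task_subnet_ids and task_security_group_id are made up for illustration:

    ```hcl
    # Hypothetical target group for awsvpc tasks.
    resource "aws_alb_target_group" "lb_target_group_ip" {
      name        = "${var.alb_name}-ip"
      port        = 8080
      protocol    = "HTTP"
      vpc_id      = "${var.vpc_id}"
      target_type = "ip" # awsvpc tasks are registered by ENI IP, not instance ID

      health_check {
        path     = "/"
        protocol = "HTTP"
      }
    }

    # Hypothetical service variant for awsvpc networking.
    resource "aws_ecs_service" "mongo_awsvpc" {
      name            = "mongodb"
      cluster         = "${aws_ecs_cluster.foo.id}"
      task_definition = "${aws_ecs_task_definition.mongo.arn}" # needs network_mode = "awsvpc"
      desired_count   = 3
      # iam_role is left out: with awsvpc the ECS service-linked role is used.

      network_configuration {
        subnets         = var.task_subnet_ids          # assumed list of subnet IDs
        security_groups = [var.task_security_group_id] # assumed task security group
      }

      load_balancer {
        target_group_arn = "${aws_alb_target_group.lb_target_group_ip.arn}"
        container_name   = "mongo"
        container_port   = 8080
      }
    }
    ```

    In this setup the ingress rule from the ALB security group would target the task security group on the container port (8080 here) instead of the ephemeral range on the instances.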