
AWS ALB-ECS 503 Service Unavailable


I am trying to set up a simple nginx web server on ECS with an ALB to balance traffic, but I get a 503 when I access the load balancer URL. I've double-checked my security groups and VPC settings. What I've found is that my target group doesn't have any healthy targets. I'm setting this up using Terraform. Here's the ALB code -

## Create a simple ALB
resource "aws_lb" "simple-alb" {
  name               = "simple-alb"
  load_balancer_type = "application"
  subnets            = var.alb_subnet_ids
  security_groups    = var.alb_security_group_ids
}

## Create a listener for the ALB
resource "aws_alb_listener" "simple-listener" {
  load_balancer_arn = aws_lb.simple-alb.arn
  port              = "80"
  protocol          = "HTTP"
  default_action {
    target_group_arn = aws_lb_target_group.simple-target-group.arn
    type             = "forward"
  }
  depends_on = [aws_lb.simple-alb, aws_lb_target_group.simple-target-group]
}

## Create a target group for the ALB
resource "aws_lb_target_group" "simple-target-group" {
  name       = var.alb_target_group_name
  port       = 80
  protocol   = "HTTP"
  vpc_id     = var.alb_vpc.id
  depends_on = [aws_lb.simple-alb]
  health_check {
    path                = "/"
    healthy_threshold   = 2
    unhealthy_threshold = 10
    timeout             = 60
    interval            = 300
    matcher             = "200,301,302"
  }
}

Here's the ECS service -

## Simple service
resource "aws_ecs_service" "simple-service" {
  name            = var.service_name
  cluster         = aws_ecs_cluster.simple-cluster.arn
  task_definition = aws_ecs_task_definition.simple-task.arn
  desired_count   = 2
  iam_role        = aws_iam_role.ecs-service-role.arn
  depends_on      = [aws_iam_role_policy_attachment.ecs-service-attach, aws_iam_role.ecs-service-role, aws_ecs_cluster.simple-cluster, aws_ecs_task_definition.simple-task]

  load_balancer {
    target_group_arn = var.alb_target_group.arn
    container_name   = var.container_name
    container_port   = "80"
  }
}

Task -

[
    {
        "name": "simple-nginx-server",
        "image": "nginx:latest",
        "cpu": 128,
        "memory": 256,
        "essential": true,
        "portMappings": [
            {
                "containerPort": 80,
                "protocol": "tcp",
                "hostPort": 80
            }
        ],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/ecs-logs/cluster-logs",
                "awslogs-region": "eu-west-2",
                "awslogs-stream-prefix": "ecs"
            }
        }
    }
]
I have verified the variables are correct, and as you can see I am wiring up the correct target group here. But I still get this error, and I don't see why -

PS D:\Code> curl simple-alb-1310900784.us-east-1.elb.amazonaws.com
curl : 503 Service Temporarily Unavailable
At line:1 char:1
+ curl simple-alb-1310900784.us-east-1.elb.amazonaws.com
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
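In case it helps, this is how I confirmed the targets are unhealthy. The target group name here is an assumption (it comes from `var.alb_target_group_name` in the config above), and this requires a configured AWS CLI:

```shell
# Look up the target group ARN by name, then list the health state
# of each registered target. "simple-target-group" is assumed to be
# the value of var.alb_target_group_name.
TG_ARN=$(aws elbv2 describe-target-groups \
  --names simple-target-group \
  --query 'TargetGroups[0].TargetGroupArn' --output text)

aws elbv2 describe-target-health --target-group-arn "$TG_ARN" \
  --query 'TargetHealthDescriptions[].TargetHealth.State'
```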

Edit 1 - Adding security groups -

## Create a public Security Group
resource "aws_security_group" "alb-public-security-group" {
  vpc_id = aws_vpc.simple-vpc.id
  name   = "simple-public-sg"
  # allow public access on port 80
  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["0.0.0.0/0"]
  }
  # allow public access on port 443
  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["0.0.0.0/0"]
  }
  # allow public egress
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = [aws_vpc.simple-vpc]
}

## Create security group for EC2 instances
## Allow access from Public ALB Security Group to the instances
resource "aws_security_group" "instance-private-security-group" {
  vpc_id = aws_vpc.simple-vpc.id
  name   = "simple-instance-sg"
  # Allow access from load balancer security group on port 80
  ingress {
    protocol  = "tcp"
    from_port = 80
    to_port   = 80
    security_groups = [
      aws_security_group.alb-public-security-group.id
    ]
  }
  # Allow access from load balancer security group on port 443
  ingress {
    protocol  = "tcp"
    from_port = 443
    to_port   = 443
    security_groups = [
      aws_security_group.alb-public-security-group.id
    ]
  }
  # block public egress
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = [aws_vpc.simple-vpc, aws_security_group.alb-public-security-group]
}

Solution

  • I was able to fix this. The issue was that the containers were not starting up due to a misconfigured log group.
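A sketch of the kind of fix this implies, assuming the root cause was that the `awslogs-group` referenced in the task definition (`/ecs-logs/cluster-logs`) did not exist. The `awslogs` driver does not create the log group on its own (unless the `awslogs-create-group` option is set to `"true"`), so tasks fail to start when it is missing. The `retention_in_days` value is an illustrative choice:

```hcl
## Create the CloudWatch log group referenced by the task definition,
## so containers using the awslogs driver can start.
resource "aws_cloudwatch_log_group" "ecs-cluster-logs" {
  name              = "/ecs-logs/cluster-logs"
  retention_in_days = 7
}
```

With the log group in place, the tasks start, register with the target group, pass health checks, and the ALB stops returning 503.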