I'm new to AWS and I'm trying to provision an ECS cluster with a capacity provider via Terraform. My plan currently applies without errors, and I can see the capacity provider launching my instances, but those instances never register with the cluster, even though the capacity provider shows up on the cluster's edit page in the web console.
Here is my config for the cluster:
resource "aws_ecs_cluster" "cluster" {
name = "main"
depends_on = [
null_resource.iam_wait
]
}
data "aws_ami" "amazon_linux_2" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-ecs-hvm-*-x86_64-ebs"]
}
}
resource "aws_launch_configuration" "cluster" {
name = "cluster-${aws_ecs_cluster.cluster.name}"
image_id = data.aws_ami.amazon_linux_2.image_id
instance_type = "t2.small"
security_groups = [module.vpc.default_security_group_id]
iam_instance_profile = aws_iam_instance_profile.cluster.name
}
resource "aws_autoscaling_group" "cluster" {
name = aws_ecs_cluster.cluster.name
launch_configuration = aws_launch_configuration.cluster.name
vpc_zone_identifier = module.vpc.private_subnets
min_size = 3
max_size = 3
desired_capacity = 3
tag {
key = "ClusterName"
value = aws_ecs_cluster.cluster.name
propagate_at_launch = true
}
tag {
key = "AmazonECSManaged"
value = ""
propagate_at_launch = true
}
}
resource "aws_ecs_capacity_provider" "cluster" {
name = aws_ecs_cluster.cluster.name
auto_scaling_group_provider {
auto_scaling_group_arn = aws_autoscaling_group.cluster.arn
managed_scaling {
status = "ENABLED"
maximum_scaling_step_size = 1
minimum_scaling_step_size = 1
target_capacity = 3
}
}
}
resource "aws_ecs_cluster_capacity_providers" "cluster" {
cluster_name = aws_ecs_cluster.cluster.name
capacity_providers = [aws_ecs_capacity_provider.cluster.name]
default_capacity_provider_strategy {
base = 1
weight = 100
capacity_provider = aws_ecs_capacity_provider.cluster.name
}
}
The instance profile role has this policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeTags",
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:StartTelemetrySession",
        "ecs:Submit*",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
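For context, the role and instance profile themselves aren't shown above; they're wired up roughly along these lines (a simplified sketch, resource and name strings here are placeholders):

resource "aws_iam_role" "cluster" {
  name = "ecs-instance-role" # placeholder name

  # EC2 has to be allowed to assume the role, otherwise the ECS agent
  # on the instance can't get credentials from the instance profile.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = { Service = "ec2.amazonaws.com" }
        Action    = "sts:AssumeRole"
      }
    ]
  })
}

# The policy shown above is attached to this role (attachment resource omitted here).

resource "aws_iam_instance_profile" "cluster" {
  name = "ecs-instance-profile" # placeholder name
  role = aws_iam_role.cluster.name
}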
I've read that this can happen when the instances don't have the proper IAM role, but as far as I can tell my roles are set up correctly, and I can't find any permission errors anywhere.
Another strange thing I've noticed: if a cluster named "default" exists, the instances register themselves to that cluster instead, even though the capacity provider is still attached to my cluster.
Figured it out! The ECS agent joins whichever cluster is named in ECS_CLUSTER in /etc/ecs/ecs.config and falls back to a cluster called "default" when that's unset, which explains the behaviour above; the capacity provider only manages the Auto Scaling group, it doesn't tell the agent which cluster to register with. I just had to set user_data in my launch configuration as shown below.
resource "aws_launch_configuration" "cluster" {
name = "cluster-${aws_ecs_cluster.cluster.name}"
image_id = data.aws_ami.amazon_linux_2.image_id
instance_type = "t2.small"
security_groups = [module.vpc.default_security_group_id]
iam_instance_profile = aws_iam_instance_profile.cluster.name
user_data = "#!/bin/bash\necho ECS_CLUSTER=${aws_ecs_cluster.cluster.name} >> /etc/ecs/ecs.config"
}
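If the bootstrap script ever grows beyond one line, the same user_data can also be written as a heredoc; this is purely cosmetic and otherwise identical to the version above:

resource "aws_launch_configuration" "cluster" {
  name                 = "cluster-${aws_ecs_cluster.cluster.name}"
  image_id             = data.aws_ami.amazon_linux_2.image_id
  instance_type        = "t2.small"
  security_groups      = [module.vpc.default_security_group_id]
  iam_instance_profile = aws_iam_instance_profile.cluster.name

  # Same effect as the single-line string: point the ECS agent at the
  # right cluster before it starts.
  user_data = <<-EOF
    #!/bin/bash
    echo ECS_CLUSTER=${aws_ecs_cluster.cluster.name} >> /etc/ecs/ecs.config
  EOF
}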