
Terraform keeps forcing a new resource (force replacement) for container definitions with default parameters


I am bringing up an aws_ecs_task_definition with the following Terraform configuration.

I pass local.image_tag as a variable to control which ECR image is deployed through Terraform.

I am able to bring up the ECS cluster on the initial terraform plan/apply cycle just fine.

However, on each subsequent terraform plan/apply cycle, Terraform forces a new container definition and therefore redeploys the entire task definition, even though the ECR image tag in local.image_tag remains the same.

This behaviour causes an unintended task definition recycle without any change to the ECR image; the only difference is Terraform replacing default values.

TF Config

resource "aws_ecs_task_definition" "this_task" {
  family                   = "this-service"
  execution_role_arn       = var.this_role
  task_role_arn            = var.this_role
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = var.env != "prod" ? 512 : 1024
  tags                     = local.common_tags
  # Log to Datadog if it's running in the prod account.
  container_definitions = (
    <<TASK_DEFINITION
[
    {
        "essential": true,
        "image": "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}",
        "environment" :[
            {"name":"ID", "value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["id"]}"},
            {"name":"SECRET","value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["secret"]}"},
            {"name":"THIS_SOCKET_URL","value":"${local.websocket_url}"},
            {"name":"THIS_PLATFORM_API","value":"${local.platform_api}"},
            {"name":"REDISURL","value":"${var.redis_url}"},
            {"name":"BASE_S3","value":"${aws_s3_bucket.ec2_vp.id}"}
        ],
        "name": "ec2-vp",
        "logConfiguration": {
            "logDriver": "awsfirelens",
            "options": {
                "Name": "datadog",
                "apikey": "${jsondecode(data.aws_secretsmanager_secret_version.datadog_api_key[0].secret_string)["api_key"]}",
                "Host": "http-intake.logs.datadoghq.com",
                "dd_service": "this",
                "dd_source": "this",
                "dd_message_key": "log",
                "dd_tags": "cluster:${var.cluster_id},Env:${var.env}",
                "TLS": "on",
                "provider": "ecs"
            }
        },
        "portMappings": [
            {
                "containerPort": 443,
                "hostPort": 443
            }
        ]
    },
    {
        "essential": true,
        "image": "amazon/aws-for-fluent-bit:latest",
        "name": "log_router",
        "firelensConfiguration": {
            "type": "fluentbit",
            "options": { "enable-ecs-log-metadata": "true" }
        }
    
    }
]
TASK_DEFINITION
)
}



-/+ resource "aws_ecs_task_definition" "this_task" {
              ~ arn                      = "arn:aws:ecs:ca-central-1:AWS_ACCOUNT_ID:task-definition/this:4" -> (known after apply)
              ~ container_definitions    = jsonencode(
                  ~ [ # forces replacement
                      ~ {
                          - cpu              = 0 -> null
                            environment      = [
                                {
                                    name  = "BASE_S3"
                                    value = "thisisthevalue"
                                },
                                {
                                    name  = "THIS_PLATFORM_API"
                                    value = "thisisthevlaue"
                                },
                                {
                                    name  = "SECRET"
                                    value = "thisisthesecret"
                                },
                                {
                                    name  = "ID"
                                    value = "thisistheid"
                                },
                                {
                                    name  = "THIS_SOCKET_URL"
                                    value = "thisisthevalue"
                                },
                                {
                                    name  = "REDISURL"
                                    value = "thisisthevalue"
                                },
                            ]
                            essential        = true
                            image            = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/this:v1.0.0-develop.6"
                            logConfiguration = {
                                logDriver = "awsfirelens"
                                options   = {
                                    Host           = "http-intake.logs.datadoghq.com"
                                    Name           = "datadog"
                                    TLS            = "on"
                                    apikey         = "thisisthekey"
                                    dd_message_key = "log"
                                    dd_service     = "this"
                                    dd_source      = "this"
                                    dd_tags        = "thisisthetags"
                                    provider       = "ecs"
                                }
                            }
                          - mountPoints      = [] -> null
                            name             = "ec2-vp"
                          ~ portMappings     = [
                              ~ {
                                    containerPort = 443
                                    hostPort      = 443
                                  - protocol      = "tcp" -> null
                                },
                            ]
                          - volumesFrom      = [] -> null
                        } # forces replacement,
                      ~ {
                          - cpu                   = 0 -> null
                          - environment           = [] -> null
                            essential             = true
                            firelensConfiguration = {
                                options = {
                                    enable-ecs-log-metadata = "true"
                                }
                                type    = "fluentbit"
                            }
                            image                 = "amazon/aws-for-fluent-bit:latest"
                          - mountPoints           = [] -> null
                            name                  = "log_router"
                          - portMappings          = [] -> null
                          - user                  = "0" -> null
                          - volumesFrom           = [] -> null
                        } # forces replacement,
                    ]
                )
                cpu                      = "256"
                execution_role_arn       = "arn:aws:iam::AWS_ACCOUNT_ID:role/thisistherole"
                family                   = "this"
              ~ id                       = "this-service" -> (known after apply)
                memory                   = "512"
                network_mode             = "awsvpc"
                requires_compatibilities = [
                    "FARGATE",
                ]
              ~ revision                 = 4 -> (known after apply)
                tags                     = {
                    "Cluster"      = "this"
                    "Env"          = "this"
                    "Name"         = "this"
                    "Owner"        = "this"
                    "Proj"         = "this"
                    "SuperCluster" = "this"
                    "Terraform"    = "true"
                }
                task_role_arn            = "arn:aws:iam::AWS_ACCOUNT_ID:role/thisistherole"
            }

Above is the terraform plan output that forces a new task definition/container definition.

As you can see, Terraform is replacing all the default values with null or empty values. I have double-checked the terraform.tfstate file generated by the previous run, and those values are exactly the same as the ones shown in the plan above.

I am not sure why this unintended behaviour is happening and would like some clues on how to fix it.

I am using Terraform 0.12.25 and the latest Terraform AWS provider.


Solution

  • This is a known bug in the Terraform AWS provider. The container definitions stored in the state include default values (such as cpu = 0 and mountPoints = []) that are missing from the configuration, so every plan sees a difference.

    To stop Terraform from replacing the running task / container definition, I had to fill in explicitly every default value that terraform plan shows being removed, using the values from the plan (mostly zero or empty lists).

    Once all the parameters were filled in, I ran the terraform plan/apply cycle again to confirm that it no longer replaces the container definition as it did before.
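
    As a rough illustration of the fix described above, here is a minimal sketch of the task definition with the defaults from the plan written out explicitly. It assumes the heredoc is swapped for jsonencode() purely for readability (the same keys could equally be added to the original JSON heredoc). The environment list is abbreviated and the logConfiguration block is left out for brevity; both would stay as in the original configuration. The default values are taken from the plan output in the question and may differ for other images or provider versions.

```hcl
# Sketch only: every default that the plan showed as "-> null" is declared
# explicitly so the configuration matches what is stored in the state.
resource "aws_ecs_task_definition" "this_task" {
  family                   = "this-service"
  execution_role_arn       = var.this_role
  task_role_arn            = var.this_role
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = var.env != "prod" ? 512 : 1024
  tags                     = local.common_tags

  container_definitions = jsonencode([
    {
      name      = "ec2-vp"
      essential = true
      image     = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}"

      # Defaults reported by the plan for this container.
      cpu         = 0
      mountPoints = []
      volumesFrom = []

      environment = [
        { name = "REDISURL", value = var.redis_url },
        # Remaining entries unchanged from the original configuration.
      ]

      # logConfiguration unchanged from the original configuration
      # (omitted here for brevity; it must be kept in the real file).

      portMappings = [
        {
          containerPort = 443
          hostPort      = 443
          protocol      = "tcp" # default reported by the plan
        }
      ]
    },
    {
      name      = "log_router"
      essential = true
      image     = "amazon/aws-for-fluent-bit:latest"

      # Defaults reported by the plan for this container.
      cpu          = 0
      environment  = []
      mountPoints  = []
      portMappings = []
      volumesFrom  = []
      user         = "0"

      firelensConfiguration = {
        type    = "fluentbit"
        options = { "enable-ecs-log-metadata" = "true" }
      }
    }
  ])
}
```

    After a change like this, a second terraform plan should report no changes to container_definitions as long as the image tag is the same. If the plan still shows a diff, the attributes marked "-> null" in that output are the ones still missing from the configuration.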