
How do I create a new revision of a task definition, instead of a new task definition, using Terraform?


I have an AWS ECS cluster running multiple tasks. Each task uses a revision of the same task definition; the revisions are identical except for a path change in the Docker command passed to them. I want to automate this process with Terraform: create a new revision of an existing task definition, then run a new task in the existing cluster that picks up this latest revision. Right now, however, every time I run the script I only manage to create a new task definition and a new ECS cluster. I am not sure what I am missing.

Here are the relevant sections of my script:

# main.tf

# Provider configuration
provider "aws" {
  region = "ap-northeast-1"
}

# Read input.yaml file
data "local_file" "input" {
  filename = "input.yaml"
}

# Parse input.yaml file
locals {
  input_data = yamldecode(data.local_file.input.content)
}

# Retrieve existing ECS task definition
data "aws_ecs_task_definition" "existing_task" {
  task_definition = "modbus-simulator-fargate-task"
}

# Create new ECS task definition based on existing one with modifications
resource "aws_ecs_task_definition" "modbus_simulator" {
  family                   = data.aws_ecs_task_definition.existing_task.family
  task_role_arn            = data.aws_ecs_task_definition.existing_task.task_role_arn
  execution_role_arn       = data.aws_ecs_task_definition.existing_task.execution_role_arn
  network_mode             = data.aws_ecs_task_definition.existing_task.network_mode
  requires_compatibilities = ["FARGATE"] # Add FARGATE here
  cpu                      = 1024        # Specify CPU here
  memory                   = 3072        # Specify memory here

  container_definitions = jsonencode([
    {
      "name": "client-server-simulator",
      "image": "337039605624.dkr.ecr.ap-northeast-1.amazonaws.com/client-server-simulator:26-june-2024",
      "cpu": 0,
      "portMappings": [
        {
          "name": "client-server-simulator-80-tcp",
          "containerPort": 80,
          "hostPort": 80,
          "protocol": "tcp",
          "appProtocol": "http"
        }
      ],
      "essential": true,
      "command": [
        "--s3-path",
        local.input_data[0].s3_path,
        "--modbus-ip",
        local.input_data[0].modbus_ip,
        "--modbus-port",
        local.input_data[0].modbus_port,
        "--interval",
        local.input_data[0].interval,
        "--edge-id",
        local.input_data[0].edge_id,
        "--location",
        local.input_data[0].location,
        "--company",
        local.input_data[0].company
      ],
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/modbus-simulator-fargate-task",
          "awslogs-create-group": "true",
          "awslogs-region": "ap-northeast-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "systemControls": []
    }
  ])
}

# Update ECS service to use the new task definition revision
resource "aws_ecs_service" "modbus_simulator_service" {
  name            = "modbus-simulator-service"
  cluster         = "modbus-simulator-cluster" # use the existing cluster
  task_definition = aws_ecs_task_definition.modbus_simulator.arn
  desired_count   = 1

  # Update only if there's a change in the task definition
  lifecycle {
    ignore_changes = [task_definition]
  }
}

[EDIT]: A comment asked me to mention where I am storing state. If by state you mean the .tfstate file, it is created and stored locally on my system.

I have tried going through the examples provided in the repository.


Solution

  • In the current implementation of aws_ecs_task_definition, the container_definitions attribute is marked as ForceNew.

    That flag is a shortcut offered by Terraform's SDK. It makes the SDK logic (working on the provider's behalf) tell Terraform Core that this attribute cannot be changed without destroying the current object and creating a new one to replace it. You can recognize this behavior in Terraform's plan: the change to that attribute is annotated with # forces replacement.

    With that flag in place, the provider forces Terraform Core to treat this change as two separate actions: destroy the existing object, then create a new one. That leaves the provider no way to represent the change as an update to the existing object, and I think an update is what would be required for the provider to model this as a new revision of the existing object rather than as an entirely new one.

    Therefore, unfortunately, I'm forced to conclude that what you want to achieve is not currently possible. The AWS provider models this object type as having container definitions that are immutable once created.

    I think the behavior you want would require a change to the provider's implementation, so you could open a feature request with the provider if a similar one isn't already open.
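
    Within the current model, a partial workaround may come close. ECS itself treats every registration under an existing family name as a new revision, so the "create" half of a forced replacement should end up as the next revision of the same family; the "destroy" half deregisters the previous revision. The sketch below assumes your provider version supports the skip_destroy argument on aws_ecs_task_definition, which is documented as retaining the old revision on replacement. The variable name s3_path is a hypothetical stand-in for the values read from input.yaml, and the role ARNs, launch type, and network configuration are left out, as in the question, so this is a fragment to adapt rather than a tested configuration.

```hcl
# Hypothetical input standing in for local.input_data[0].s3_path.
variable "s3_path" {
  type        = string
  description = "S3 path passed to the container command"
}

resource "aws_ecs_task_definition" "modbus_simulator" {
  # Reusing the existing family name means each replacement is registered
  # by ECS as the next revision of that family.
  family                   = "modbus-simulator-fargate-task"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 1024
  memory                   = 3072

  # Assumption: supported by the provider version in use. Keeps the previous
  # revision registered instead of deregistering it during replacement.
  skip_destroy = true

  container_definitions = jsonencode([
    {
      name      = "client-server-simulator"
      image     = "337039605624.dkr.ecr.ap-northeast-1.amazonaws.com/client-server-simulator:26-june-2024"
      essential = true
      command   = ["--s3-path", var.s3_path]
    }
  ])
}

resource "aws_ecs_service" "modbus_simulator_service" {
  name            = "modbus-simulator-service"
  cluster         = "modbus-simulator-cluster" # existing cluster, referenced by name
  task_definition = aws_ecs_task_definition.modbus_simulator.arn
  desired_count   = 1

  # No ignore_changes on task_definition here: with it in place, Terraform
  # would not point the service at the newly registered revision.
}
```

    Terraform's plan will still show the task definition as a replacement, which is the limitation described above; the difference is only in what remains registered in ECS afterwards.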