Search code examples
amazon-web-servicesaws-batchaws-ecs

ECS unable to assume role


From the console, I am invoking a lambda which submits a batch job. The batch job fails, indicating that ECS is unable to assume the role that is provided to execute the job definition.

For the role, I've added the lambda and ECS services.

The error message:

"ECS was unable to assume the role 'arn:aws:iam::749340585813:role/golfnow-invoke-write-progress' that was provided for this task. Please verify that the role being passed has the proper trust relationship and permissions and that your IAM user has permissions to pass this role."

"TrainingJobRole": {
  "Type": "AWS::IAM::Role",
  "Properties": {
    "RoleName": "golfnow-invoke-write-progress",
    "AssumeRolePolicyDocument": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": [
              "lambda.amazonaws.com",
              "ecs.amazonaws.com"
            ]
          },
          "Action": [
            "sts:AssumeRole"
          ]
        }
      ]
    },
    "Path": "/"
  }
}

The batch job:

    "TrainingJob": {
  "Type": "AWS::Batch::JobDefinition",
  "Properties": {
    "Type": "container",
    "JobDefinitionName": {
      "Fn::Sub": "c12e-golfnow-${Environment}-job"
    },
    "ContainerProperties": {
      "Image": {
        "Fn::Join": [
          "",
          [
            "{{ image omitted }}",
            {
              "Ref": "AWS::Region"
            },
            ".amazonaws.com/amazonlinux:latest"
          ]
        ]
      },
      "Vcpus": 2,
      "Memory": 2000,
      "Command": [
        "while", "True", ";", "do", "echo", "'hello';", "done"
      ],
      "JobRoleArn": {
        "Fn::GetAtt": [
          "TrainingJobRole",
          "Arn"
        ]
      }
    },
    "RetryStrategy": {
      "Attempts": 1
    }
  }
},
"JobQueue": {
  "Type": "AWS::Batch::JobQueue",
  "Properties": {
    "Priority": 1,
    "ComputeEnvironmentOrder": [
      {
        "Order": 1,
        "ComputeEnvironment": {
          "Ref": "ComputeEnvironment"
        }
      }
    ]
  }
}

Is the issue with the way it's being invoked? My user has admin privileges, so I don't think this is an issue with my user having insufficient permissions.


Solution

  • You have to add the principal "ecs-tasks.amazonaws.com" to the trust policy for the role that's submitting a Batch job (not "ecs.amazonaws.com").

    Revised role:

    "TrainingJobRole": {
          "Type": "AWS::IAM::Role",
          "Properties": {
            "RoleName": "golfnow-invoke-write-progress",
            "AssumeRolePolicyDocument": {
              "Version": "2012-10-17",
              "Statement": [
                {
                  "Effect": "Allow",
                  "Principal": {
                    "Service": [
                      "lambda.amazonaws.com",
                      "ecs-tasks.amazonaws.com"
                    ]
                  },
                  "Action": [
                    "sts:AssumeRole"
                  ]
                }
              ]
            },
            "Path": "/"
          }
        },