amazon-web-services amazon-ecs aws-fargate ecs-taskdefinition

AWS Fargate tasks won't start reliably


I have an ECS cluster with a bunch of different tasks in it (all using the same Docker image but with different environment variables).

Some of the tasks come up without problems, but others fail frequently even though I've used the same VPC, subnet and security group. The error message shows ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post https://api.ecr..

The bizarre part is that the same task sometimes comes up if I create a new task definition, or delete the ECR repository and re-upload the Docker image.

I'm unable to draw any conclusion from this.

Update: strangely, the task starts successfully when I deregister the task definition and recreate it with the same specs, but only once.


Solution

  • It turns out you have to select the task execution role under both "Task Role - override" and "Task Execution Role - override" in the Advanced Options section of the Run Task dialog when starting the task. I don't know why it worked intermittently before, or why it worked once each time I recreated the task definition.
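For anyone starting tasks from a script rather than the console, the same two overrides can be passed through the `overrides` argument of boto3's ECS `run_task` call. The following is only a sketch under assumptions: the cluster name, task definition, subnet, security group and role ARN are placeholders that do not come from the question, and the helper names `build_run_task_kwargs` and `run_task_with_role_overrides` are hypothetical. It mirrors the console fix by setting both the task role and the task execution role to the execution role.

```python
from typing import Any, Dict, List, Optional


def build_run_task_kwargs(
    cluster: str,
    task_definition: str,
    subnets: List[str],
    security_groups: List[str],
    execution_role_arn: str,
    task_role_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for ECS run_task with both role overrides set.

    If task_role_arn is not given, the execution role is used for both
    overrides, which is what the console fix above amounts to.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
            }
        },
        "overrides": {
            # Equivalent of "Task Execution Role - override" in the console.
            "executionRoleArn": execution_role_arn,
            # Equivalent of "Task Role - override" in the console.
            "taskRoleArn": task_role_arn or execution_role_arn,
        },
    }


def run_task_with_role_overrides(**kwargs: Any) -> Dict[str, Any]:
    """Start the task; needs boto3 and AWS credentials (assumed available)."""
    import boto3  # imported lazily so the builder works without boto3

    ecs = boto3.client("ecs")
    return ecs.run_task(**build_run_task_kwargs(**kwargs))


if __name__ == "__main__":
    # Placeholder values for illustration only; replace with your own.
    example = build_run_task_kwargs(
        cluster="my-cluster",
        task_definition="my-task:1",
        subnets=["subnet-0123456789abcdef0"],
        security_groups=["sg-0123456789abcdef0"],
        execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    )
    print(example["overrides"])
```

Keeping the argument construction separate from the API call makes it easy to inspect exactly which roles will be sent before a task is launched.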