amazon-web-services amazon-s3 service containers amazon-ecs

Amazon ECS Task Failing to Download Environment Files from S3


I am encountering an issue with my ECS service where tasks are consistently failing during deployment. The specific error message I receive is as follows:

ResourceInitializationError: failed to download env files: file download command: non empty error stream: service call has been retried 5 time(s): RequestCanceled: request context canceled caused by: context deadline exceeded

  • ECS tasks are configured to download environment files from an S3 bucket.
  • My ECS service is in the Seoul region (ap-northeast-2), and the S3 bucket is in the US East (Ohio) region (us-east-2).
  • The S3 bucket and objects are not set to public access.
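
For reference, the setup described in the list above might look roughly like the following sketch. The bucket name `my-config-bucket` and the key `app/prod.env` are hypothetical. The `environmentFiles` field and the `s3:GetObject` / `s3:GetBucketLocation` permissions on the task execution role follow what the ECS documentation describes for S3 environment files, but treat the exact policy as an assumption to check against your own setup.

```python
import json
from typing import Any, Dict, List


def env_file_entry(bucket: str, key: str) -> Dict[str, str]:
    """Build one item for a container definition's `environmentFiles` list."""
    if not key.endswith(".env"):
        raise ValueError("ECS expects environment files to use the .env extension")
    return {"value": f"arn:aws:s3:::{bucket}/{key}", "type": "s3"}


def execution_role_statements(bucket: str, key: str) -> List[Dict[str, Any]]:
    """IAM statements the task execution role would need to fetch the file."""
    return [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket}/{key}"],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetBucketLocation"],
            "Resource": [f"arn:aws:s3:::{bucket}"],
        },
    ]


# Hypothetical names, for illustration only.
BUCKET = "my-config-bucket"
KEY = "app/prod.env"

container_fragment = {"environmentFiles": [env_file_entry(BUCKET, KEY)]}
print(json.dumps(container_fragment, indent=2))
print(json.dumps(execution_role_statements(BUCKET, KEY), indent=2))
```

Note that neither of these settings gives the task a network path to S3. A private bucket only means the request has to be authorized by the execution role; the task still needs a route to the S3 endpoint in the bucket's region.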

If you have any ideas about this issue or need additional information, please let me know.

Please note that I am not a native English speaker, so I apologize for any lack of clarity in my explanation. I appreciate your understanding.

I suspect that the issue might be related to timeout settings, as the error indicates that the request is canceled after multiple retries due to a context deadline being exceeded. I have tried setting the startTimeout and stopTimeout in the task definition JSON to 120 seconds, but this has not resolved the issue.
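
One way to tell a timeout setting apart from a plain connectivity problem is to try opening a TCP connection to the regional S3 endpoint from a host in the same subnet. The sketch below is only an illustration: the function names are hypothetical, and it assumes you have some host (for example an EC2 instance) in that subnet to run it from. The `connect` parameter exists only so the function can be exercised without a network.

```python
import socket
from typing import Callable


def s3_endpoint(region: str) -> str:
    """Regional S3 endpoint hostname, e.g. s3.us-east-2.amazonaws.com."""
    return f"s3.{region}.amazonaws.com"


def can_reach(
    host: str,
    port: int = 443,
    timeout: float = 5.0,
    connect: Callable = socket.create_connection,
) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        sock = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, DNS failures, and unreachable-network errors.
        return False
    sock.close()
    return True
```

If `can_reach(s3_endpoint("us-east-2"))` returns `False` from inside the subnet, the cause is more likely routing or a security group than any timeout value. As far as the ECS documentation describes them, `startTimeout` and `stopTimeout` are container-level settings for dependency resolution and graceful shutdown, so they would not be expected to change how long the agent waits for the env file download.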

We had seen similar failures before, but they always cleared up on retry, so we did not change anything. However, after I added a new scheduled task, that task never ran successfully.

To address this, after registering the new scheduled task I revised the task definition with the timeout settings mentioned above and deployed it. Unfortunately, the service deployment then failed with the same error. I rolled back by registering a revision without the timeout settings, which allowed the service deployment to proceed. However, the scheduled task still fails with the same error.


Solution

  • The task was running in a private subnet of my VPC that had no NAT gateway, so it had no network path to the S3 bucket. After I created a NAT gateway and added a route to it in the subnet's existing route table, the issue was resolved.

    I hope this helps anyone who faces the same issue.
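
To check for the same condition, a minimal sketch like the one below can inspect a route table for a default route through a NAT gateway or internet gateway. It assumes the input has the shape of one entry from the EC2 `DescribeRouteTables` response (a `Routes` list with `DestinationCidrBlock`, `NatGatewayId`, `GatewayId`, and `State` keys). The function name and the sample data, including the NAT gateway ID, are hypothetical.

```python
from typing import Any, Dict

DEFAULT_ROUTE = "0.0.0.0/0"


def has_outbound_internet_route(route_table: Dict[str, Any]) -> bool:
    """True if the table has an active 0.0.0.0/0 route via a NAT or internet gateway."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != DEFAULT_ROUTE:
            continue
        if route.get("State", "active") != "active":
            # A "blackhole" route points at a target that no longer exists.
            continue
        if route.get("NatGatewayId"):
            return True
        if route.get("GatewayId", "").startswith("igw-"):
            return True
    return False


# Hypothetical private-subnet route table before the fix: local route only.
before_fix = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
    ]
}

# The same table after adding a default route to a (made-up) NAT gateway.
after_fix = {
    "Routes": before_fix["Routes"]
    + [
        {
            "DestinationCidrBlock": DEFAULT_ROUTE,
            "NatGatewayId": "nat-0123456789abcdef0",
            "State": "active",
        }
    ]
}

print(has_outbound_internet_route(before_fix))  # False
print(has_outbound_internet_route(after_fix))   # True
```

With boto3, the real data could come from `describe_route_tables` filtered on `association.subnet-id`; a subnet with no explicit association uses the VPC's main route table. An internet gateway route only helps if the task also has a public IP. An S3 gateway VPC endpoint is a common alternative to a NAT gateway, but as far as I know it only covers S3 in the same region as the VPC, so it would likely not help with a cross-region bucket like the one in this question.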