boto3, amazon-ecs, amazon-efs

AWS EFS Mounting To ECS Fargate Task Fails Abruptly


I'm using Boto3 to run a task definition on Fargate and mount an empty EFS file system into the task.

The SDK code:

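def run_fargate_task(self):
    # self.ecs is assumed to be a Boto3 ECS client, e.g. boto3.client('ecs')
    response = self.ecs.run_task(
        cluster='XXXXXXXX',
        count=1,
        enableECSManagedTags=True,
        launchType='FARGATE',
        networkConfiguration={
            'awsvpcConfiguration': {
                'securityGroups': [
                    'sg-XXXXXXXXXXXXXXXXX',
                ],
                # Fargate places the task in one of these subnets
                'subnets': [
                    'subnet-XXXXXXXXXXXXXXXXX',
                    'subnet-XXXXXXXXXXXXXXXXX',
                ],
                'assignPublicIp': 'ENABLED'
            }
        },
        taskDefinition='XXXXX:1'
    )
    # Return the response so the caller can inspect 'tasks' and 'failures'
    return response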

When I run the run_fargate_task() function it succeeds most of the time with this response metadata:

"ResponseMetadata": {
    "RequestId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
        "x-amzn-requestid": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
        "content-type": "application/x-amz-json-1.1",
        "content-length": "1297",
        "date": "Fri, 11 Jun 2021 14:34:46 GMT",
    },
    "RetryAttempts": 0,
},

But sometimes I get this error:

ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: Failed to resolve "fs-XXXXXX.efs.us-east-2.amazonaws.com" - check that your file system ID is correct.

I'm sure the EFS file system ID is correct, since the task sometimes starts successfully without any change to the code. I'm also sure my security group has an inbound rule allowing NFS traffic on port 2049 for EFS.

I also made sure I'm not running multiple tasks against the same EFS file system.

Even if I stop a task that started successfully and wait a few minutes (in case the old task is still holding the file system), the problem persists.


Solution

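  • You have specified 2 subnets for Fargate. Does your EFS file system have a mount target in both AZs/subnets?

    As @jordanm commented above, this was indeed my problem. Fargate can place the task in either of the two subnets, and the file system's DNS name only resolves from an Availability Zone that has a mount target, which would explain why the task failed only some of the time.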
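
    For anyone hitting the same error, here is a minimal sketch of how the check could be automated with Boto3. The function names and the region, file system ID, and subnet IDs passed in are placeholders I made up for illustration. The sketch only compares Availability Zones, so it will not catch other causes such as a mount target in a different VPC or a security group problem.

```python
def subnets_without_mount_target(subnets, mount_targets):
    """Return the IDs of task subnets whose AZ has no EFS mount target.

    `subnets` is the 'Subnets' list from ec2.describe_subnets and
    `mount_targets` is the 'MountTargets' list from efs.describe_mount_targets.
    """
    covered_azs = {mt["AvailabilityZoneId"] for mt in mount_targets}
    return [
        s["SubnetId"]
        for s in subnets
        if s["AvailabilityZoneId"] not in covered_azs
    ]


def check_efs_coverage(file_system_id, subnet_ids, region_name):
    """Hypothetical helper: look up the subnets and mount targets in AWS."""
    # Imported here so the pure helper above can be used without AWS access.
    import boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    efs = boto3.client("efs", region_name=region_name)

    subnets = ec2.describe_subnets(SubnetIds=subnet_ids)["Subnets"]
    # Pagination is ignored here; a file system has at most one mount
    # target per AZ, so a single page is assumed to be enough.
    mount_targets = efs.describe_mount_targets(
        FileSystemId=file_system_id
    )["MountTargets"]

    return subnets_without_mount_target(subnets, mount_targets)
```

    If check_efs_coverage(...) returns a non-empty list, either add a mount target in each of those subnets' Availability Zones or remove those subnets from the 'subnets' list passed to run_task.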