amazon-web-services docker aws-cloudformation amazon-ecs mount

AWS: Mounting a template disk with Batch / ECS


I am using AWS Batch and want to increase the disk space available to my containers. I am creating the stack with CloudFormation, and I have added an AWS::EC2::LaunchTemplate that attaches a 100 GB disk to my instances (extract from the stack):

  BigDiskTemplate:
    Type: 'AWS::EC2::LaunchTemplate'
    Properties:
      LaunchTemplateData:
        BlockDeviceMappings:
          - DeviceName: '/dev/xvdcz'
            Ebs:
              Encrypted: true
              VolumeSize: 100
              VolumeType: gp2
      LaunchTemplateName: BigDiskTemplate
  MyComputeEnvironment:
    Type: 'AWS::Batch::ComputeEnvironment'
    Properties:
      Type: MANAGED
      ComputeEnvironmentName: MyEnv
      ComputeResources:
        Type: EC2
        MinvCpus: 0
        DesiredvCpus: 0
        MaxvCpus: 256
        LaunchTemplate:
          LaunchTemplateName: BigDiskTemplate
        InstanceTypes:
          - optimal
          - c5.large
        Subnets:
          - !Ref Subnet
        SecurityGroupIds:
          - !Ref SecurityGroup
        InstanceRole: !Ref IamInstanceProfile
      ServiceRole: !Ref BatchServiceRole

Yes, I want the disk to be ephemeral. Yes, I know some EC2 instance types come with larger disks, but I also want to do this with GPU instances.

When I run lsblk in the container, I get:

NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme1n1       259:0    0  100G  0 disk 
└─nvme1n1p1   259:6    0  100G  0 part 
nvme0n1       259:1    0    8G  0 disk 
├─nvme0n1p1   259:2    0    8G  0 part /etc/hosts
└─nvme0n1p128 259:3    0    1M  0 part 

Great! There's my 100GB disk. But I can't work out how to mount it. Based on samples and tutorials (admittedly for EC2) a template disk should be mountable with something like:

file -s /dev/nvme1n1
mkfs -t xfs /dev/nvme1n1

mkdir /data
mount /dev/nvme1n1 /data

However, most of these steps fail, with errors such as "/dev/nvme1n1: cannot open `/dev/nvme1n1' (No such file or directory)", "mkfs.xfs: No such file or directory", and "mount: /data: permission denied". I have also tried different device paths, e.g. /dev/nvme1n1p1, nvme1n1, and /nvme1n1/nvme1n1p1.

So how do I mount this disk inside my container? Is Docker part of the problem?


Solution

  • Here are the steps you should take:

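    1. Attach the volume to the EC2 instance. lsblk on the instance should now show the device.
    2. On the instance (not in the container), format the device if needed and mount it at a location like /data. Set permissions and so on.
    3. In the task definition (the job definition, for AWS Batch), declare a volume whose host source path is /data.
    4. In your container definition, declare a mount point that references that volume.

    You cannot / should not mount a device directly in a container. Containers run unprivileged by default, which is why mount fails there with "permission denied"; mount the device on the host and bind-mount the directory into the container instead.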
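
    For AWS Batch, steps 3 and 4 both live in the job definition. The fragment below is a minimal sketch in the same CloudFormation style as the question, assuming the host directory is /data. The resource name MyJobDefinition, the volume name data, the image, and the command are placeholders for illustration, not taken from the original stack.

    ```yaml
      MyJobDefinition:
        Type: 'AWS::Batch::JobDefinition'
        Properties:
          Type: container
          JobDefinitionName: MyJob            # hypothetical name
          ContainerProperties:
            Image: 'amazonlinux:2'            # placeholder image
            Command:
              - df
              - '-h'
              - /data
            ResourceRequirements:
              - Type: VCPU
                Value: '1'
              - Type: MEMORY
                Value: '2048'
            # Step 3: a volume backed by the directory mounted on the host.
            Volumes:
              - Name: data
                Host:
                  SourcePath: /data
            # Step 4: bind that volume into the container at /data.
            MountPoints:
              - SourceVolume: data
                ContainerPath: /data
                ReadOnly: false
    ```

    If the host mount worked, a job submitted with this definition should report the size of the large disk for /data rather than that of the 8G root volume.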


    Edit: Details on step 2

    You can add a user data script to your launch template (or launch configuration) so that the instance formats and mounts the device when it boots. Take this script as an example:

    #!/bin/bash
    
    # Device name. NOT block name like 'nvme0n1p1'.
    device="/dev/sdp"
    
    # Where to mount the device.
    mountpoint="/data"
    
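    # Wait until the device name resolves to a block device.
    while [[ ! -b "$(readlink -f "${device}")" ]]; do
        echo "waiting for ${device}" >&2
        sleep 2
    done
    
    # Resolve the device name to the real block device (an NVMe name on some instance types).
    blockdev="$(readlink -f "${device}")"
    
    # Format only if the device has no filesystem yet.
    blkid "${blockdev}" || mkfs -t ext4 "${blockdev}"
    
    # Mount. A directory needs the execute bit to be entered, so use 777 rather than 666.
    mkdir -p "${mountpoint}"
    mount "${blockdev}" "${mountpoint}"
    chmod 777 "${mountpoint}"
    
    # Persist the volume in /etc/fstab so it gets mounted again after a reboot.
    echo "${blockdev} ${mountpoint} ext4 defaults,nofail 0 2" >> /etc/fstab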
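
    With AWS Batch, the script goes into the UserData of the launch template, and Batch expects that user data in MIME multi-part archive format. The fragment below is a sketch of how the question's BigDiskTemplate might carry a shortened version of the script. The device name /dev/sdp and the boundary string are assumptions chosen to match the script above, and DeleteOnTermination is added because the disk is meant to be ephemeral. Because plain Fn::Base64 is used (no Fn::Sub), the ${...} shell variables are passed through untouched.

    ```yaml
      BigDiskTemplate:
        Type: 'AWS::EC2::LaunchTemplate'
        Properties:
          LaunchTemplateName: BigDiskTemplate
          LaunchTemplateData:
            BlockDeviceMappings:
              - DeviceName: /dev/sdp          # assumed; must match "device" in the script
                Ebs:
                  Encrypted: true
                  VolumeSize: 100
                  VolumeType: gp2
                  DeleteOnTermination: true
            UserData:
              Fn::Base64: |
                MIME-Version: 1.0
                Content-Type: multipart/mixed; boundary="==BOUNDARY=="

                --==BOUNDARY==
                Content-Type: text/x-shellscript; charset="us-ascii"

                #!/bin/bash
                device="/dev/sdp"
                mountpoint="/data"
                # Wait for the device, then format (if needed) and mount it.
                while [[ ! -b "$(readlink -f "${device}")" ]]; do sleep 2; done
                blockdev="$(readlink -f "${device}")"
                blkid "${blockdev}" || mkfs -t ext4 "${blockdev}"
                mkdir -p "${mountpoint}"
                mount "${blockdev}" "${mountpoint}"
                chmod 777 "${mountpoint}"

                --==BOUNDARY==--
    ```

    The /etc/fstab line is left out here on the assumption that Batch instances are replaced rather than rebooted; keep it if your instances can restart.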