
How to create an AWS ECS cluster with EC2 infrastructure and an ASG using the AWS CLI


In the AWS online UI, when creating a cluster, a user can select these additional options:

  • Infrastructure:

    • Amazon EC2 instances
      • Auto Scaling group: Create new ASG|select
      • Provisioning model: On-demand|Spot
      • Container instance Amazon Machine Image: Amazon Linux image etc.
      • EC2 instance type: t2.micro
      • EC2 instance role: Create New Role|Select
      • Desired capacity: min: 0, max 5 etc.
      • SSH Key pair:
  • Networking settings:

    • VPC:
    • Subnets:
    • Security Group:

When using the AWS CLI from the terminal, the options for creating a cluster are limited.

See example: https://awscli.amazonaws.com/v2/documentation/api/2.3.2/reference/ecs/create-cluster.html

It is not possible to select most of these other options via the aws create-cluster command. At the very least I want to be able to select EC2 and ASG.

How is this typically done via the AWS CLI? What other commands would I need to run to replicate what the UI "Create Cluster" does? My end goal is to push Docker images to AWS EC2 using AWS CLI commands only, before moving on to other orchestration tools.

I don't need specific implementation details, just the core steps, i.e.

  • create-cluster
  • create task definition
  • what's next and in what order?

I have so far followed the instructions: https://docs.aws.amazon.com/AmazonECR/latest/userguide/getting-started-cli.html

Steps 1 to 7.

I cannot find any more tutorials in the AWS documentation that describe what I intend to do above.


Solution

  • OK, after a lot of debugging and tutorials (none of which were complete), I've established a working sequence of steps.

    Prerequisite: build and test a local Dockerfile for a single nginx hello-world example.

    Folder structure

    /deployScripts 
      ..scripts here...
    /project_name/
      Dockerfile etc...
    

    Disclaimer: this tutorial is not complete either, because you will have to supply the variables used below yourself.
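As one way to supply those variables, the values could be kept in a single file that every script sources first. This is only a sketch: the file name vars.sh and every value in it are placeholders I made up, and the image name assumes the usual ECR pattern of account, region and repository.

```shell
#!/bin/sh
# vars.sh (hypothetical file name): load it with `. ./vars.sh` before the other scripts.
# Every value below is a placeholder; substitute your own.
export project_name="nginx-hello"
export region="us-east-1"
export accountID="123456789012"
export keypair="my-keypair"
export securityGroupName="${project_name}-sg"

# Full image name as ECR expects it:
# <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
export imageFullTAG="${accountID}.dkr.ecr.${region}.amazonaws.com/${project_name}:latest"
```

The variables are exported because envsubst, which renders the templates later on, only sees exported variables.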

    You will also need to give your configured AWS user permission to run the commands below. The missing permissions are not hard to discover, because each command fails with an error message naming the permission it needs. Add them one by one to the policy attached to your user. Eventually the policy will look something like this (not complete):

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ecr:CreateRepository",
    ...ADD YOUR PERMISSION HERE...
                ],
                "Resource": "*"
            }
        ]
    }
    
    1. cd deployScripts

    2. Create-Repository.sh

    aws ecr create-repository --repository-name "${project_name}" > /dev/null
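Running create-repository a second time fails because the repository already exists. A minimal sketch of a re-runnable variant follows; ensure_repo is a hypothetical helper name, not an AWS CLI command.

```shell
#!/bin/sh
# ensure_repo (hypothetical helper): create the ECR repository only if
# describe-repositories cannot find it.
ensure_repo() {
  repo_name="$1"
  if aws ecr describe-repositories --repository-names "$repo_name" > /dev/null 2>&1; then
    echo "exists"
  else
    aws ecr create-repository --repository-name "$repo_name" > /dev/null && echo "created"
  fi
}
```

It would be called as `ensure_repo "${project_name}"`.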
    
    3. Push.sh
    cd deployScripts
    # docker login expects the registry host, i.e. the part of the image name before the first "/"
    aws ecr get-login-password --region "${region}" | docker login --username AWS --password-stdin "${imageFullTAG%%/*}"
    docker build -t "${project_name}" "../${project_name}"
    docker tag "${project_name}" "${imageFullTAG}"
    docker push "${imageFullTAG}"
    
    4. create-cluster.sh
    aws ecs create-cluster --cluster-name $project_name
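create-cluster only creates an empty cluster; no EC2 capacity is attached at this point. To confirm the cluster exists before continuing, its status can be queried. A small sketch, where cluster_status is a hypothetical helper name:

```shell
#!/bin/sh
# cluster_status (hypothetical helper): print the status of the named
# cluster, which should be ACTIVE once creation has finished.
cluster_status() {
  aws ecs describe-clusters --clusters "$1" \
    --query 'clusters[0].status' --output text
}
```

It would be called as `cluster_status "$project_name"`.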
    
    5. Create-security-groups.sh
    #!/bin/sh
    
    securityGroupID=$(aws ec2 create-security-group --group-name "$securityGroupName" --description "Security group for docker" --vpc-id "$vpcID" | jq -r '.GroupId')
    
    # Look up my public IP. This worked for me on Windows; use an alternative command on Linux if needed.
    myip=$(curl -s ifcfg.me)
    
    # ssh from my ip
    aws ec2 authorize-security-group-ingress \
        --group-id $securityGroupID \
        --protocol tcp \
        --port 22 \
        --cidr $myip/32
    
    ## https from anywhere
    aws ec2 authorize-security-group-ingress \
        --group-id $securityGroupID \
        --protocol tcp \
        --port 443 \
        --cidr 0.0.0.0/0
    
    ## http from anywhere
    aws ec2 authorize-security-group-ingress \
        --group-id $securityGroupID \
        --protocol tcp \
        --port 80 \
        --cidr 0.0.0.0/0
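The three ingress rules above differ only in port and CIDR, so they could be folded into one helper. This is a sketch of that refactoring; open_port and open_web_ports are hypothetical names, and the AWS call is the same authorize-security-group-ingress used above.

```shell
#!/bin/sh
# open_port (hypothetical helper): allow inbound TCP on one port.
# $1 = security group ID, $2 = TCP port, $3 = CIDR block allowed in
open_port() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "$2" \
    --cidr "$3"
}

# open_web_ports (hypothetical helper): SSH from a single address,
# HTTP and HTTPS from anywhere.
# $1 = security group ID, $2 = your public IP address
open_web_ports() {
  sg_id="$1"
  my_ip="$2"
  open_port "$sg_id" 22 "${my_ip}/32" &&
  for port in 80 443; do
    open_port "$sg_id" "$port" "0.0.0.0/0" || return 1
  done
}
```

It would be called as `open_web_ports "$securityGroupID" "$myip"`.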
    
    6. Run EC2 instance and associate with cluster

    user-data.template

    #!/bin/bash
    echo ECS_CLUSTER=$project_name >> /etc/ecs/ecs.config
    echo ECS_AVAILABLE_LOGGING_DRIVERS='["json-file","awslogs"]' >> /etc/ecs/ecs.config
    echo "ECS_CONTAINER_INSTANCE_TAGS={\"ECS_CLUSTER\": \"$project_name\"}" >> /etc/ecs/ecs.config
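The template is rendered with envsubst in the script that follows. If envsubst is not installed, a here-document could produce the same user-data text. A sketch under that assumption; render_user_data is a hypothetical helper name.

```shell
#!/bin/sh
# render_user_data (hypothetical helper): print the user-data script with
# the cluster name filled in. $1 = cluster name.
render_user_data() {
  cluster="$1"
  cat <<EOF
#!/bin/bash
echo ECS_CLUSTER=${cluster} >> /etc/ecs/ecs.config
echo ECS_AVAILABLE_LOGGING_DRIVERS='["json-file","awslogs"]' >> /etc/ecs/ecs.config
echo "ECS_CONTAINER_INSTANCE_TAGS={\"ECS_CLUSTER\": \"${cluster}\"}" >> /etc/ecs/ecs.config
EOF
}
```

It would be called as `render_user_data "$project_name" > ./user-data.txt`.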
    

    Run-instance-associate-with-cluster.sh

    #!/bin/sh
    
    amiID=$(aws --profile default ssm get-parameter --name '/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id' | jq -r '.Parameter.Value')
    
    # Assumes the account has a single VPC in this region; otherwise set vpcID yourself
    vpcID=$(aws ec2 describe-vpcs | jq -r '.Vpcs[].VpcId')
    # subnets=$(aws --profile default ec2 describe-subnets --filters "Name=vpc-id,Values=vpc-0eb5674ddf242dea3" | jq -r '[.Subnets[] | select(.Tags[]?.Value) .SubnetId] | join(" ")')
    
    subnetID=$(aws --profile default ec2 describe-subnets --filters "Name=vpc-id,Values=${vpcID}" | jq -r '[.Subnets[] | select(.MapPublicIpOnLaunch==true) .SubnetId][0]')
    
    
    # Create user-data.txt file
    envsubst < ./user-data.template > ./user-data.txt
    
    instanceID=$(aws ec2 run-instances \
      --profile default \
      --image-id $amiID \
      --count 1 \
      --instance-type t2.micro \
      --iam-instance-profile Name=ecsInstanceRole \
      --key-name $keypair \
      --security-group-ids $securityGroupID \
      --subnet-id $subnetID \
      --user-data file://user-data.txt \
      --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=nginx-instance}]' \
      | jq -r '.Instances[] | .InstanceId')
    
    echo "Created instance for cluster: instanceID=${instanceID}"
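run-instances returns as soon as the launch is requested, before the instance has booted and its ECS agent has joined the cluster. A sketch of a check that could follow; wait_for_registration is a hypothetical helper name.

```shell
#!/bin/sh
# wait_for_registration (hypothetical helper): block until the instance is
# running, then print how many container instances the cluster reports.
# $1 = instance ID, $2 = cluster name
wait_for_registration() {
  aws ec2 wait instance-running --instance-ids "$1" || return 1
  aws ecs describe-clusters --clusters "$2" \
    --query 'clusters[0].registeredContainerInstancesCount' --output text
}
```

It would be called as `wait_for_registration "$instanceID" "$project_name"`. The count can still be 0 for a short while after the instance is running, because the agent registers only once the user-data script has finished and the agent has started.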
    
    7. create-log-group.sh
    aws logs create-log-group --log-group-name $project_name
    
    8. Create Task definition

    Create the following two roles, each with the "AmazonEC2ContainerServiceforEC2Role" policy attached:

    • ecsTaskExecutionRole
    • ecs-task-role
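The roles can also be created from the CLI. The sketch below is an assumption on my part rather than what was originally run: create_task_role is a hypothetical helper, and the trust policy lets ECS tasks (ecs-tasks.amazonaws.com) assume the role, which is what a task or execution role normally needs.

```shell
#!/bin/sh
# create_task_role (hypothetical helper): create a role that ECS tasks can
# assume and attach the managed policy named above. $1 = role name.
create_task_role() {
  trust_file=$(mktemp)
  # Trust policy (assumed): allow the ECS tasks service to assume the role.
  cat > "$trust_file" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "file://${trust_file}" > /dev/null &&
  aws iam attach-role-policy --role-name "$1" \
    --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
  status=$?
  rm -f "$trust_file"
  return "$status"
}
```

It would be called once per role, e.g. `create_task_role ecsTaskExecutionRole` and `create_task_role ecs-task-role`.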

    task-definition.template

    {
      "family": "$project_name",
      "requiresCompatibilities": [
        "EC2"
      ],
      "executionRoleArn": "arn:aws:iam::$accountID:role/ecsTaskExecutionRole",
      "taskRoleArn": "arn:aws:iam::$accountID:role/ecs-task-role",
      "networkMode": "bridge",
      "containerDefinitions": [
        {
          "name": "$project_name",
          "image": "$imageFullTAG",
          "cpu": 128,
          "memoryReservation": 256,
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-create-group": "true",
              "awslogs-group": "$project_name",
              "awslogs-region": "$region",
              "awslogs-stream-prefix": "ecs"
            }
          },
          "portMappings": [
            {
              "hostPort": 80,
              "containerPort": 80,
              "protocol": "tcp"
            }
          ],
          "essential": true,
          "healthCheck": {
            "command": [
              "CMD-SHELL",
              "curl -f http://localhost/ || exit 1"
            ],
            "interval": 10,
            "retries": 3,
            "startPeriod": 0,
            "timeout": 5
          }
        }
      ]
    }
    

    Create-task-definition.sh

    #!/bin/sh
    envsubst < ./task-definition.template > ./task-definition.json
    export taskDefinitionArn=$(aws ecs register-task-definition --cli-input-json file://task-definition.json | jq -r '.taskDefinition.taskDefinitionArn')
    echo "Created Task Definition: ${taskDefinitionArn}"
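envsubst replaces any variable that is unset or not exported with an empty string, so a forgotten export produces a task definition with blank fields rather than an error. A sketch of a guard that could run before envsubst; require_exported is a hypothetical helper name.

```shell
#!/bin/sh
# require_exported (hypothetical helper): fail if any named variable is
# missing from the environment or empty.
require_exported() {
  missing=0
  for name in "$@"; do
    # `env` lists only exported variables, which is all envsubst can see.
    if ! env | grep -q "^${name}=."; then
      echo "not exported or empty: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It would be called as `require_exported project_name accountID imageFullTAG region || exit 1` at the top of Create-task-definition.sh.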
    
    9. create-service.sh
    #!/bin/sh
    serviceArn=$(aws ecs create-service \
    --cluster $project_name \
    --service-name $project_name \
    --task-definition $taskDefinitionArn \
    --desired-count 1 \
    --launch-type EC2 | jq -r '.service .serviceArn')
    
    echo "Created serviceArn=${serviceArn}"
    

    Visit the Public IPv4 DNS of the instance, found in AWS > EC2 > 'nginx-instance'.
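The address can also be looked up from the CLI instead of the console. A sketch, where service_url is a hypothetical helper name; it waits for the service to settle and then prints the instance's public DNS name as a URL.

```shell
#!/bin/sh
# service_url (hypothetical helper): wait until the service is stable,
# then print an http URL for the instance.
# $1 = cluster name, $2 = service name, $3 = EC2 instance ID
service_url() {
  aws ecs wait services-stable --cluster "$1" --services "$2" || return 1
  dns=$(aws ec2 describe-instances --instance-ids "$3" \
    --query 'Reservations[0].Instances[0].PublicDnsName' --output text) || return 1
  echo "http://${dns}/"
}
```

It would be called as `service_url "$project_name" "$project_name" "$instanceID"`, and the printed URL can be opened in a browser or passed to curl.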

    done!

    Conclusion: the original answer was not to use AWS CLI commands, because it isn't typically done this way. The alternatives are the AWS console, Terraform, or potentially ecs-cli, but how do you really learn the low-level details if you do that? A lot of the terminology and steps are hidden. My steps above could probably be refined a little more, but I had lots of hurdles to clear to get this to work.

    Notes: the main problems I had came from following tutorials that weren't complete. For example, for the image in task-definition.json I first used the short image name, and that continuously failed for me. I had to use the full image tag exactly as it is pushed to AWS.
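To avoid typing the full image name by hand, it could be read back from ECR. A sketch, where full_image_name is a hypothetical helper name and the tag defaults to "latest" as an assumption:

```shell
#!/bin/sh
# full_image_name (hypothetical helper): print the repository URI that ECR
# reports, followed by the tag. $1 = repository name, $2 = tag (optional)
full_image_name() {
  uri=$(aws ecr describe-repositories --repository-names "$1" \
    --query 'repositories[0].repositoryUri' --output text) || return 1
  echo "${uri}:${2:-latest}"
}
```

The result of `full_image_name "$project_name"` is the value that belongs in the "image" field of the task definition.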