amazon-web-services amazon-ecs autoscaling aws-fargate

Up-to-date role needed in CloudFormation for autoscaling AWS ECS Fargate


I'm slowly trudging through getting an elegant set of CloudFormation templates for setting up a load-balanced Docker application running on AWS ECS Fargate. I have a VPC with two public subnets, an Internet gateway, a load balancer, and a single ECS service with a single ECS task that is pulling a (for now hard-coded) container from ECR, using an auto-provisioned certificate in conjunction with Route 53 for SSL. It works.

I'm now adding autoscaling, and whew, it's starting to get confusing. The reason is that so much documentation isn't clear about whether it's for EC2 specifically, or for Fargate, or manually done through the UI, or whether it uses a built-in role, whether the role was created via the UI even though the service is created in CloudFormation, which role to use, etc.

From what I can tell at this point, if I am using Fargate I only need to define an AWS::ApplicationAutoScaling::ScalableTarget and an AWS::ApplicationAutoScaling::ScalingPolicy. (Apparently the scalable target manages CloudWatch for me?) But the scalable target requires a role that I presume allows access to CloudWatch for setting up alarms and to ECS for scaling up or scaling down. And this is where the documentation gets pretty vague/confusing, and where the online examples are all over the place.

Does AWS provide a built-in role I should be using for autoscaling? The examples seem to imply that it doesn't, and that I'll need to create one. A few examples declare a new role inside the CloudFormation template (good, because I don't want to create it manually), and they have the Statement part in common: basically Allow the sts:AssumeRole action for the service ecs-tasks.amazonaws.com. But what about the policy?

One example seems to create a policy completely from scratch, with a path of /, a policy name of root, and rules allowing the ecs:UpdateService, cloudwatch:PutMetricAlarm, etc. (This other example does too.) But yet another example seems to simply reference an existing AWS managed policy arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceAutoscaleRole:

AutoScalingRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: !Join ['', [!Ref ServiceName, AutoScalingRole]]
    AssumeRolePolicyDocument:
      Statement:
        - Effect: Allow
          Principal:
            Service: ecs-tasks.amazonaws.com
          Action: 'sts:AssumeRole'
    ManagedPolicyArns:
      - 'arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceAutoscaleRole'

But wouldn't you know it, I read that AmazonEC2ContainerServiceAutoscaleRole is being phased out. Instead it says to see this other page about using an "Application Auto Scaling service-linked role for Amazon ECS". And that other page says:

You don't need to manually create a service-linked role. Application Auto Scaling creates the appropriate service-linked role for you when you call RegisterScalableTarget. For example, if you set up automatic scaling for an Amazon ECS service, Application Auto Scaling creates the AWSServiceRoleForApplicationAutoScaling_ECSService role.

And now I've come full circle, more confused than ever. So all those examples are out of date? I can get by without even creating a role to auto-scale Fargate? But what do I specify as the RoleARN for AWS::ApplicationAutoScaling::ScalableTarget?

And apparently I'll need to specify permission for it to create that new role, and to do that … I'll need to declare another role with that permission?

I'm at the point where each step is just adding more confusion. Can someone break down in simple terms what is needed in 2023, in terms of IAM, to get simple autoscaling working with an existing load balancer and a single Fargate service with a single task?


Solution

  • The most up-to-date information (however terse) for providing a role for ECS Fargate autoscaling can be found in Service-linked roles for Application Auto Scaling and the documents it references. Every single CloudFormation example I've found on the Internet is out-of-date, referring either to phased-out policies or even creating inline roles with a list of specific allowed actions. Even one of the most recent Udemy CloudFormation classes I've seen still jumps out of CloudFormation and goes and manually creates a role for autoscaling. But in fact none of that is needed.

    In short, there are certain roles that AWS will create automatically to allow access to certain services. These are called "service-linked roles", and the one we want to use here for ECS Fargate autoscaling is arn:aws:iam::012345678910:role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService, where 012345678910 is the account ID. This role can be specified as the RoleARN of the scalable target (AWS::ApplicationAutoScaling::ScalableTarget).

    To create the AWSServiceRoleForApplicationAutoScaling_ECSService role if it doesn't exist already, the iam:CreateServiceLinkedRole permission is needed. But importantly when using CloudFormation the role seems to be created when the CloudFormation stack is created. In other words, the scaling policy itself doesn't need the iam:CreateServiceLinkedRole permission; merely the user creating the CloudFormation stack needs to have the iam:CreateServiceLinkedRole permission.
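
    As a rough sketch of what that permission might look like, the managed policy below could be attached to whichever user or role deploys the stack. The logical name DeployerServiceLinkedRolePolicy is hypothetical, and the policy would have to exist before the autoscaling stack is created (for example in a separate bootstrap stack), so treat this as an illustration under those assumptions rather than a tested template:

    ```yaml
    # Hypothetical policy for the principal that deploys the stack.
    # It only permits creating the ECS Application Auto Scaling
    # service-linked role, nothing else.
    DeployerServiceLinkedRolePolicy:
      Type: AWS::IAM::ManagedPolicy
      Properties:
        Description: Allow creation of the ECS Application Auto Scaling service-linked role
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Action: "iam:CreateServiceLinkedRole"
              Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService"
              Condition:
                StringLike:
                  # Restrict the permission to this one service principal.
                  "iam:AWSServiceName": ecs.application-autoscaling.amazonaws.com
    ```

    Once the service-linked role exists in the account, later stack deployments should not need this permission again, since the role is shared by all ECS services in the account.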

    Thus simple autoscaling based upon CPU utilization in an ECS Fargate cluster ECSCluster with service ECSService can be turned on using something like this:

      WebAutoScalingTarget:
        Type: AWS::ApplicationAutoScaling::ScalableTarget
        Properties:
          ServiceNamespace: ecs
          ResourceId: !Sub "service/${ECSCluster}/${ECSService.Name}"
          ScalableDimension: ecs:service:DesiredCount
          MinCapacity: 1
          MaxCapacity: 3
          RoleARN: !Sub "arn:aws:iam::${AWS::AccountId}:role/aws-service-role/ecs.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_ECSService"
    
      WebAutoScalingPolicy:
        Type: AWS::ApplicationAutoScaling::ScalingPolicy
        Properties:
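          # PolicyName is a required property; this particular name is only an example.
          PolicyName: !Sub "${AWS::StackName}-cpu-target-tracking"
          PolicyType: TargetTrackingScaling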
          ScalingTargetId: !Ref WebAutoScalingTarget
          TargetTrackingScalingPolicyConfiguration:
            PredefinedMetricSpecification:
              PredefinedMetricType: ECSServiceAverageCPUUtilization
            TargetValue: 70
    

    (I haven't verified this exact autoscaling configuration, because initially I am using PredefinedMetricType: ALBRequestCountPerTarget for easier testing. The role should be the same in either case.)
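
    For reference, here is a sketch of what the ALBRequestCountPerTarget variant might look like. It assumes the load balancer and target group are declared in the same template under the hypothetical logical names LoadBalancer and TargetGroup, and that the scalable target is named WebAutoScalingTarget as referenced by the policy above; the policy name and the target value of 100 requests per target are arbitrary placeholders. Unlike the CPU metric, this metric type also needs a ResourceLabel identifying the load balancer and target group:

    ```yaml
    # Hypothetical alternative policy: scale on ALB requests per target.
    WebRequestCountScalingPolicy:
      Type: AWS::ApplicationAutoScaling::ScalingPolicy
      Properties:
        PolicyName: !Sub "${AWS::StackName}-request-count-target-tracking"
        PolicyType: TargetTrackingScaling
        ScalingTargetId: !Ref WebAutoScalingTarget
        TargetTrackingScalingPolicyConfiguration:
          PredefinedMetricSpecification:
            PredefinedMetricType: ALBRequestCountPerTarget
            # Expected format:
            # app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>
            ResourceLabel: !Join
              - "/"
              - - !GetAtt LoadBalancer.LoadBalancerFullName
                - !GetAtt TargetGroup.TargetGroupFullName
          # Placeholder: average requests per target to aim for.
          TargetValue: 100
    ```

    The RoleARN on the scalable target stays exactly as shown earlier; only the scaling policy changes.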