amazon-web-services terraform autoscaling amazon-ecs blue-green-deployment

How to implement blue/green deployments in AWS with Terraform without losing capacity

I have seen multiple articles discussing blue/green deployments, and they consistently involve forcing recreation of the Launch Configuration and the Auto Scaling Group (ASG). For example:

https://groups.google.com/forum/#!msg/terraform-tool/7Gdhv1OAc80/iNQ93riiLwAJ

This works well in general, except that the desired capacity of the ASG gets reset to the default. If my cluster has scaled out under load when I deploy, there will be a sudden drop in capacity.

My question is this: is there a way to execute a Terraform blue/green deployment without a loss of capacity?

Solution

I don't have a full Terraform-only solution to this.

The approach I use is to run a small script that reads the current desired capacity, stores it in a variable, and then uses that variable in the ASG definition.

# Note: make expands the whole recipe, including $(eval ...), before it runs
# any line, so the AWS lookup happens even if the env check below then fails.
# The checks use POSIX test syntax because make runs recipes with /bin/sh.
handle-desired-capacity:
    @echo "Handling current desired capacity"
    @echo "---------------------------------"
    @if [ -z "$(env)" ]; then \
        echo "Cannot continue without an environment"; \
        exit 1; \
    fi
    $(eval DESIRED_CAPACITY := $(shell aws autoscaling describe-auto-scaling-groups --profile $(env) | jq -SMc '.AutoScalingGroups[] | select((.Tags[]|select(.Key=="Name")|.Value) | match("prod-asg-app")).DesiredCapacity'))
    @if [ -z "$(DESIRED_CAPACITY)" ]; then \
        echo "Could not determine desired capacity."; \
        exit 1; \
    fi
    @if [ "$(DESIRED_CAPACITY)" -lt 2 ] || [ "$(DESIRED_CAPACITY)" -gt 10 ]; then \
        echo "Can only deploy between 2 and 10 instances."; \
        exit 1; \
    fi
    @echo "Desired Capacity is $(DESIRED_CAPACITY)"
    @sed -i.bak 's!desired_capacity = [0-9]*!desired_capacity = $(DESIRED_CAPACITY)!g' $(env)/terraform.tfvars
    @rm -f $(env)/terraform.tfvars.bak
    @echo ""

Clearly, this is as ugly as it gets, but it does the job.
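One way to make this less ugly might be to skip the sed rewrite of terraform.tfvars and pass the value on the command line instead. The sketch below is only an illustration: validate_capacity is a hypothetical helper name, and it assumes the Terraform variable is called desired_capacity, as in the tfvars line above.

```shell
# Hypothetical helper: print the capacity if it is a non-negative integer
# within [min, max]; otherwise print an error to stderr and return non-zero.
validate_capacity() {
    value=$1
    min=$2
    max=$3
    case $value in
        ''|*[!0-9]*)
            # Empty, or contains something other than digits.
            echo "Could not determine desired capacity: '$value'" >&2
            return 1
            ;;
    esac
    if [ "$value" -lt "$min" ] || [ "$value" -gt "$max" ]; then
        echo "Can only deploy between $min and $max instances (got $value)." >&2
        return 1
    fi
    echo "$value"
}
```

It could then be used along the lines of capacity=$(validate_capacity "$DESIRED_CAPACITY" 2 10) && terraform apply -var "desired_capacity=$capacity". A -var flag takes precedence over a value in terraform.tfvars, so the file no longer needs to be edited on every run.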

I am looking into whether we can expose the name of the ASG as an output in the remote state, which I could then use on the next run to look up the desired capacity, but I don't yet understand this well enough to make it useful.
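A minimal sketch of how that lookup could work, untested against a real account: it assumes the configuration declares an output named asg_name (a hypothetical name) and that the installed Terraform supports terraform output -raw. The function name asg_desired_capacity is likewise made up for illustration.

```shell
# Assumption: the Terraform configuration declares an output called
# "asg_name" (hypothetical) that holds the Auto Scaling Group's name.
# Prints the group's current desired capacity for the given AWS profile.
asg_desired_capacity() {
    profile=$1
    # Read the ASG name recorded in the state by the previous apply.
    asg_name=$(terraform output -raw asg_name) || return 1
    [ -n "$asg_name" ] || return 1
    # Query that one group by name instead of filtering every group by tag.
    aws autoscaling describe-auto-scaling-groups \
        --profile "$profile" \
        --auto-scaling-group-names "$asg_name" \
        --query 'AutoScalingGroups[0].DesiredCapacity' \
        --output text
}
```

Because terraform output reads from whatever backend the working directory is configured with, this should also work with remote state. The printed value should still be checked to be a number in the allowed range before it is handed back to Terraform.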