terraform, amazon-emr, terraform-provider-aws

In Terraform, can I recreate an EMR cluster resource when its bootstrap action contents change?


I'm not quite sure how to solve this problem in Terraform.

We have an EMR cluster with some bootstrap actions whose scripts are stored as S3 objects. A simplified view of our Terraform config is:

resource "aws_s3_bucket_object" "bootstrap_action" {
  bucket = "${var.s3_emr_bootstrap_bucket}"
  key    = "bootstrap"
  content = <<EOF
#!/bin/bash
echo "Doing bootstrap actions"
EOF
}

resource "aws_emr_cluster" "emr_cluster" {

    ...configuration of the EMR cluster...

    bootstrap_action {
        path = "s3://${aws_s3_bucket_object.bootstrap_action.bucket}/${aws_s3_bucket_object.bootstrap_action.key}"
        name = "Bootstrap Step"
    }
}

What we'd like is for a change to the contents of the bootstrap action script to cause the cluster to be rebuilt. Right now we have to manually taint the cluster whenever the script changes.

I have tried using depends_on, but that only affects ordering; it doesn't actually force a rebuild.

There is some discussion of this issue at https://github.com/hashicorp/terraform/issues/8099. Reading it, I don't see an obvious solution, but figured I'd post a question anyway.


Solution

  • You want to find some argument of aws_emr_cluster that forces the resource to be recreated whenever its value changes. I usually use name or description, provided they are available and changing them forces recreation. Name seems reasonable here.

    Do something like this:

    locals {
      script = <<EOF
    #!/bin/bash
    echo "Doing bootstrap actions"
    EOF
      script_sha = "${sha256(local.script)}"
    }
    

    ...

    name = "emr_cluster_name ${local.script_sha}"
    

    Then when the contents of your script change, the hash changes, so the name of the cluster changes and the cluster is forced to rebuild.
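    To make the wiring explicit, here is a minimal sketch that combines the snippet above with the resources from the question. It assumes the script body is moved into the local so that the S3 object and the cluster name are both derived from the same value; the resource and variable names are taken from the question, and the rest of the cluster configuration is left elided as it is there.

    ```hcl
    locals {
      # Single source of truth for the bootstrap script body.
      script = <<EOF
    #!/bin/bash
    echo "Doing bootstrap actions"
    EOF

      # Changes whenever the script body changes.
      script_sha = "${sha256(local.script)}"
    }

    resource "aws_s3_bucket_object" "bootstrap_action" {
      bucket  = "${var.s3_emr_bootstrap_bucket}"
      key     = "bootstrap"
      content = "${local.script}"
    }

    resource "aws_emr_cluster" "emr_cluster" {
      # A new hash gives a new name, which forces a new cluster.
      name = "emr_cluster_name ${local.script_sha}"

      # ...configuration of the EMR cluster...

      bootstrap_action {
        path = "s3://${aws_s3_bucket_object.bootstrap_action.bucket}/${aws_s3_bucket_object.bootstrap_action.key}"
        name = "Bootstrap Step"
      }
    }
    ```

    Because the bootstrap_action path still references the S3 object, Terraform should upload the updated script before it creates the replacement cluster.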

    This obviously will not work if name can be updated in place on a resource, because changing it then no longer forces a rebuild. The issue you linked is a better discussion of how to solve this in the general case.
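    For the general case, newer Terraform releases (1.2 and later) have a replace_triggered_by lifecycle argument that does not depend on any attribute forcing recreation. The following is a sketch only, assuming you can run such a version and that a change to the object's content is planned as an update of the S3 object; the resource names are the ones from the question.

    ```hcl
    resource "aws_emr_cluster" "emr_cluster" {
      # ...configuration of the EMR cluster...

      lifecycle {
        # Plan a replacement of the cluster whenever the bootstrap
        # script object is planned for update or replacement.
        replace_triggered_by = [
          aws_s3_bucket_object.bootstrap_action
        ]
      }
    }
    ```

    With this in place the cluster name can stay stable, since the replacement is driven by the planned change to the S3 object rather than by a hash embedded in the name.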