Archiving files in Terraform using fileset()


I have a Lambda function with the directory structure below:

-src/
   --lambda/
      test.py
   --shared/
      constants.py
      utils.py
      exceptions.py
      random.py (do not want in lambda)
      something.py (do not want in lambda)
   --dummyjsons/
      <100 json files>

I am trying to deploy this to AWS using Terraform, and I want to use archive_file to package these files (except the two marked above) into a single zip.

The data block in Terraform currently looks like this:

data "archive_file" "test"{
  type        = "zip"
  output_path = ".terraform/test.zip"

  source {
    content  = file("src/lambda/test.py")
    filename = "test.py"
  }

  source {
    content  = file("src/shared/constants.py")
    filename = "src/shared/constants.py"
  }

  source {
    content  = file("src/shared/utils.py")
    filename = "src/shared/utils.py"
  }

  source {
    content  = file("src/shared/exceptions.py")
    filename = "src/shared/exceptions.py"
  }

  # Include all dummyjsons
  for_each = fileset("src/dummyjsons", "*.json")
  source {
    content  = file("src/dummyjsons/${each.value}")
    filename = "src/dummyjsons/${each.value}"
  }
}

The lambda block looks like:

resource "aws_lambda_function" "test" {
  # Other attributes 

  filename = data.archive_file.test.output_path
  source_code_hash = data.archive_file.test.output_base64sha256

  # Other attributes
}

Attempt 1:

With the code above, I got the following error on the filename and source_code_hash lines of the Lambda block:

│ Error: Missing resource instance key
│ Because data.archive_file.test has "for_each" set, its
│ attributes must be accessed on specific instances.
│ 
│ For example, to correlate with indices of a referring resource, use:
│     data.archive_file.test[each.key]

Attempt 2:

I updated my Lambda block to the following:

resource "aws_lambda_function" "test" {
  # Other attributes 

  filename = data.archive_file.test[each.key].output_path
  source_code_hash = data.archive_file.test[each.key].output_base64sha256

  # Other attributes
}

And got the below error for the same lines:

│ Error: Reference to "each" in context without for_each
│ 
│ The "each" object can be used only in "module" or "resource" blocks, and
│ only when the "for_each" argument is set.

Do you know how else to make this work?


Solution

  • I'm assuming that your goal is to add each of the files matched by *.json under src/dummyjsons to your archive.

    That means you need one source block for each of those files. Your first attempt failed because it instead declared one entire archive_file per JSON file: if that had worked, the many separate instances of data.archive_file.test would all have tried to write to the same output file, with each write containing only one of the JSON files.

    You can ask Terraform to dynamically generate zero or more nested source blocks using a dynamic block:

    data "archive_file" "test"{
      type        = "zip"
      output_path = "${path.module}/test.zip"
    
      source {
        content  = file("${path.module}/src/lambda/test.py")
        filename = "test.py"
      }
    
      # (...and all of the other hard-coded ones...)
    
      # Include all dummyjsons
      dynamic "source" {
        for_each = fileset("${path.module}/src/dummyjsons", "*.json")
        content {
          content  = file("${path.module}/src/dummyjsons/${source.value}")
          filename = "src/dummyjsons/${source.value}"
        }
      }
    }
    

    Notice that the for_each argument is now inside the dynamic block, and so the repetition is limited only to that block. dynamic blocks are slightly different to resource-level for_each in that Terraform literally generates multiple separate source blocks based on the given specification, so from the perspective of the hashicorp/archive provider (which is the one implementing this archive_file data source) this is indistinguishable from you having written out multiple source blocks directly.
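
    To make that expansion concrete, here is a sketch of what the dynamic block above is equivalent to, assuming (hypothetically) that src/dummyjsons contains only two files named a.json and b.json. Those file names are made up for illustration:

    ```hcl
    # Equivalent hand-written form of the dynamic "source" block,
    # assuming fileset() matched exactly a.json and b.json (hypothetical names).
    source {
      content  = file("${path.module}/src/dummyjsons/a.json")
      filename = "src/dummyjsons/a.json"
    }

    source {
      content  = file("${path.module}/src/dummyjsons/b.json")
      filename = "src/dummyjsons/b.json"
    }
    ```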

    Because the dynamic block has a smaller scope than for_each directly inside your data block, you refer to the value of the current element using source.value instead of each.value. The "source" identifier there is named after the block type.
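
    Two follow-ups may be useful here, shown as a sketch rather than a definitive configuration. First, a dynamic block accepts an optional iterator argument if you would rather not use the block type name as the symbol; json_file below is a hypothetical name chosen for illustration. Second, because the data source itself no longer has for_each set, there is only one instance of it, so the Lambda resource from your first attempt can refer to data.archive_file.test without any instance key:

    ```hcl
    data "archive_file" "test" {
      type        = "zip"
      output_path = "${path.module}/test.zip"

      # (...hard-coded source blocks as before...)

      dynamic "source" {
        for_each = fileset("${path.module}/src/dummyjsons", "*.json")
        iterator = json_file # hypothetical name; replaces the default "source"
        content {
          content  = file("${path.module}/src/dummyjsons/${json_file.value}")
          filename = "src/dummyjsons/${json_file.value}"
        }
      }
    }

    resource "aws_lambda_function" "test" {
      # Other attributes

      # Single data source instance, so no [each.key] index is needed.
      filename         = data.archive_file.test.output_path
      source_code_hash = data.archive_file.test.output_base64sha256
    }
    ```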