Search code examples
terraformterraform-provider-gcp

Using HEREDOC in Terraform config always triggers update


I am using Terraform to configure a BigQuery Scheduled Query in GCP. Because the query is quite big, I have opted to use HEREDOC to allow me to set write it as a multi-line string.

Here is a simplified version of the config:

resource "google_bigquery_data_transfer_config" "refresh_view" {
  display_name           = "refresh-view"
  location               = var.bigquery_dataset_location
  data_source_id         = "scheduled_query"
  schedule               = "every day 06:00"
  destination_dataset_id = google_bigquery_dataset.dataset.dataset_id
  params = {
    destination_table_name_template = "my_joined_view"
    write_disposition               = "WRITE_TRUNCATE"
    query                           = local.local_query
  }

  depends_on = [var.service_account_member]
}

locals {
  local_query = <<-EOT
      SELECT
          table_a.*,
          table_b.*,
      FROM `${var.project_id}.dataset.table_a` as table_a
      JOIN `${var.project_id}.dataset.table_b` as table_b ON table_b.table_a_id = table_a_id
    EOT
}

The issue I am encountering is that each time I run terraform plan, the 'query' triggers an update for this resource:

  # module.bigquery.google_bigquery_data_transfer_config.refresh_view will be updated in-place
  ~ resource "google_bigquery_data_transfer_config" "refresh_view" {
        id                       = "..."
        name                     = "..."
      ~ params                   = {
          ~ "query"                           = <<-EOT
                SELECT
                ....

Although nothing has changed, terraform always marks the 'query' param as changed. Nothing in the actual query itself is marked as changed. (Note: If I deliberately modify line of the SQL, the modified line of SQL would be marked in the terraform plan output, but in this case the query has not changed and nothing is marked).

I think I have narrowed down the issue to using the HEREDOC string. I have tested:

  • replacing the variables in the query (e.g. project_id} with a hard-coded value,
  • moving the query from locals block directly to the resource config,
  • removing whitespace characters (indentation) from the query.

The only thing that helped was changing the query into single-line string. This is however an awful solution, since the query is fairly long and this approach makes it unreadable.


Solution

  • ...expanding on my comment

    Here is one option to "minify" the query that we use:

    variable "project_id" {
      default = "abc"
    }
    
    locals {
      query = <<-EOT
          SELECT
              table_a.*,
              table_b.*,
          FROM `${var.project_id}.dataset.table_a` as table_a
          JOIN `${var.project_id}.dataset.table_b` as table_b
          ON table_b.table_a_id = table_a_id
        EOT
    }
    
    output "min_query" {
      value = trimspace(replace(local.query, "/\\s+/", " "))
    }
    

    just two functions:

    • trimspace no parameters, it removes any spaces from the start and end
    • replace on this one we use a regex to replace all "extra spaces" from the query

    and a terraform plan on that will show:

    Changes to Outputs:
      + min_query = "SELECT table_a.*, table_b.*, FROM `abc.dataset.table_a` as table_a JOIN `abc.dataset.table_b` as table_b ON table_b.table_a_id = table_a_id"
    

    That way in the code the query we keep it indented and easy to read, but we send the minified version to avoid the weird update behavior in the google_bigquery_data_transfer_config resource