I am using Terraform to configure a BigQuery Scheduled Query in GCP. Because the query is quite big, I have opted to use HEREDOC to allow me to set write it as a multi-line string.
Here is a simplified version of the config:
resource "google_bigquery_data_transfer_config" "refresh_view" {
display_name = "refresh-view"
location = var.bigquery_dataset_location
data_source_id = "scheduled_query"
schedule = "every day 06:00"
destination_dataset_id = google_bigquery_dataset.dataset.dataset_id
params = {
destination_table_name_template = "my_joined_view"
write_disposition = "WRITE_TRUNCATE"
query = local.local_query
}
depends_on = [var.service_account_member]
}
locals {
local_query = <<-EOT
SELECT
table_a.*,
table_b.*,
FROM `${var.project_id}.dataset.table_a` as table_a
JOIN `${var.project_id}.dataset.table_b` as table_b ON table_b.table_a_id = table_a_id
EOT
}
The issue I am encountering is that each time I run terraform plan, the 'query' triggers an update for this resource:
# module.bigquery.google_bigquery_data_transfer_config.refresh_view will be updated in-place
~ resource "google_bigquery_data_transfer_config" "refresh_view" {
id = "..."
name = "..."
~ params = {
~ "query" = <<-EOT
SELECT
....
Although nothing has changed, terraform always marks the 'query' param as changed. Nothing in the actual query itself is marked as changed. (Note: If I deliberately modify line of the SQL, the modified line of SQL would be marked in the terraform plan output, but in this case the query has not changed and nothing is marked).
I think I have narrowed down the issue to using the HEREDOC string. I have tested:
The only thing that helped was changing the query into single-line string. This is however an awful solution, since the query is fairly long and this approach makes it unreadable.
...expanding on my comment
Here is one option to "minify" the query that we use:
variable "project_id" {
default = "abc"
}
locals {
query = <<-EOT
SELECT
table_a.*,
table_b.*,
FROM `${var.project_id}.dataset.table_a` as table_a
JOIN `${var.project_id}.dataset.table_b` as table_b
ON table_b.table_a_id = table_a_id
EOT
}
output "min_query" {
value = trimspace(replace(local.query, "/\\s+/", " "))
}
just two functions:
trimspace
no parameters, it removes any spaces from the start and endreplace
on this one we use a regex to replace all "extra spaces" from the queryand a terraform plan on that will show:
Changes to Outputs:
+ min_query = "SELECT table_a.*, table_b.*, FROM `abc.dataset.table_a` as table_a JOIN `abc.dataset.table_b` as table_b ON table_b.table_a_id = table_a_id"
That way in the code the query we keep it indented and easy to read, but we send the minified version to avoid the weird update behavior in the google_bigquery_data_transfer_config
resource