Tags: google-cloud-platform, google-bigquery, terraform, dataset

How to create multiple datasets and set access in BigQuery


I need to create multiple datasets and assign a service account to each, giving access with the BigQuery Admin role.

variables.tf

variable "project_id" {
  type = string
  default = "<projectid>"
}

variable "set_location" {
  type = string
  default = "southamerica-east1"
}

variable "dataset_name" {
  type = list
  default = ["firs-dataset",
              "second-dataset"]
}

main.tf

resource "google_bigquery_dataset" "dataset" {
  dataset_id = "${var.dataset_name[count.index]}"
  count = length("${var.dataset_name}")
  location = "${var.set_location}"

  access {
    role = "roles/bigquery.admin"
    user_by_email = "<service-account>"
  }
}

With that I can create multiple datasets, but this way I can only grant access to a single service account, and it is the same one for every dataset.

I need each dataset to have its own service account with the BigQuery Admin role.


Solution

  • You can configure the datasets in a JSON file.

    The Terraform module structure used in this example is:

    datasets
      resource/datasets.json
      main.tf
      locals.tf
    

    The datasets.json file:

    {
      "datasets": {
        "dataset1": {
          "dataset_id": "dataset1",
          "location" : "EU",
          "friendly_name" : "Name",
          "description" : "Description",
          "role": "roles/bigquery.admin",
          "service_account": "account1@<project>.iam.gserviceaccount.com"
        },
        "dataset2": {
          "dataset_id": "dataset2",
          "location" : "EU",
          "friendly_name" : "Name",
          "description" : "Description",
          "role": "roles/bigquery.admin",
          "service_account": "account2@<project>.iam.gserviceaccount.com"
        }
      }
    }
    

    You can then load this configuration in a locals.tf file:

    locals {
      datasets = jsondecode(file("${path.module}/resource/datasets.json"))["datasets"]
    }
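
    To check that the JSON file is decoded as expected, you can evaluate the local value with terraform console (a quick sketch, assuming the module has already been initialized with terraform init; you may be prompted for any input variables such as project_id):

    $ terraform console
    > local.datasets["dataset1"].service_account
    "account1@<project>.iam.gserviceaccount.com"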
    

    In the main.tf file, we loop over the datasets configured above with for_each:

    resource "google_bigquery_dataset" "datasets" {
      for_each = local.datasets
    
      project                     = var.project_id
      dataset_id                  = each.value["dataset_id"]
      friendly_name               = each.value["friendly_name"]
      description                 = each.value["description"]
      location                    = each.value["location"]
    
      access {
        role = each.value["role"]
        user_by_email = each.value["service_account"]
      }
    }
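
    Note that main.tf references var.project_id, but the module structure above does not show a variables file. A minimal sketch of the missing pieces, assuming the module lives in a datasets/ directory as listed above (the variable and module names here are placeholders):

    # datasets/variables.tf
    variable "project_id" {
      type = string
    }

    # root configuration calling the module
    module "datasets" {
      source     = "./datasets"
      project_id = "<projectid>"
    }

    Also worth noting: access blocks on google_bigquery_dataset are authoritative, so once any access block is declared, Terraform manages the dataset's entire access list and the default roles BigQuery would normally add are not included automatically. If you only want to append entries without managing the whole list, the separate google_bigquery_dataset_access resource is an alternative.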