Tags: azure, azure-databricks, terraform-provider-azure, terraform-provider-databricks

Getting Error with Azure Databricks and Terraform


I have the code below for my Databricks setup. At the moment I only have a workspace, with no clusters in it.

terraform {
  required_providers {
    azuread     = "~> 1.0"
    azurerm     = "~> 2.0"
    azuredevops = { source = "registry.terraform.io/microsoft/azuredevops", version = "~> 0.0" }
    databricks  = { source = "registry.terraform.io/databrickslabs/databricks", version = "~> 0.0" }
  }
}

provider "random" {}
provider "azuread" {
  tenant_id     = var.project.arm.tenant.id
  client_id     = var.project.arm.client.id
  client_secret = var.secret.arm.client.secret
}

provider "databricks" {
  host                        = azurerm_databricks_workspace.db-workspace.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.db-workspace.id
  azure_tenant_id             = var.project.arm.tenant.id
  azure_client_id             = var.project.arm.client.id
  azure_client_secret         = var.secret.arm.client.secret
}


resource "azurerm_databricks_workspace" "db-workspace" {
  name                          = module.names-db-workspace.environment.databricks_workspace.name_unique
  resource_group_name           = module.resourcegroup.resource_group.name
  location                      = module.resourcegroup.resource_group.location
  sku                           = "premium"
  public_network_access_enabled = true

  custom_parameters {
    no_public_ip                                         = true
    virtual_network_id                                   = module.virtualnetwork["centralus"].virtual_network.self.id
    public_subnet_name                                   = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-1-public"].name
    private_subnet_name                                  = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-2-private"].name
    public_subnet_network_security_group_association_id  = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-public-nsg-db-sub-1-public"].id
    private_subnet_network_security_group_association_id = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-private-nsg-db-sub-2-private"].id
  }
  tags = local.tags
}

Databricks Cluster Creation

resource "databricks_cluster" "dbcselfservice" {
  cluster_name            = format("adb-cluster-%s-%s", var.project.name, var.project.environment.name)
  spark_version           = var.spark_version
  node_type_id            = var.node_type_id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 7
  }
  azure_attributes {
    availability       = "SPOT_AZURE"
    first_on_demand    = 1
    spot_bid_max_price = 100
  }
  depends_on = [
    azurerm_databricks_workspace.db-workspace
  ]
}

Databricks Workspace RBAC Permission

resource "databricks_group" "db-group" {
  display_name               = format("adb-users-%s", var.project.name)
  allow_cluster_create       = true
  allow_instance_pool_create = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

resource "databricks_user" "dbuser" {
  count            = length(local.display_name)
  display_name     = local.display_name[count.index]
  user_name        = local.user_name[count.index]
  workspace_access = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

Adding Members to Databricks Admin Group

resource "databricks_group_member" "i-am-admin" {
  for_each  = toset(local.email_address)
  group_id  = data.databricks_group.admins.id
  member_id = databricks_user.dbuser[index(local.email_address, each.key)].id
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

data "databricks_group" "admins" {
  display_name = "admins"
  depends_on = [
    #    resource.databricks_cluster.dbcselfservice,
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

When I run terraform plan, I get the error below:

╷
│ Error: cannot read group: cannot configure azure-client-secret auth: cannot get workspace: please set `azure_workspace_resource_id` provider argument. Attributes used: azure_client_id, azure_client_secret, azure_tenant_id. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details
│ 
│   with databricks_group.db-group,
│   on resources.adb.tf line 71, in resource "databricks_group" "db-group":
│   71: resource "databricks_group" "db-group" {
│ 
╵
╷
│ Error: cannot read user: cannot configure azure-client-secret auth: cannot get workspace: please set `azure_workspace_resource_id` provider argument. Attributes used: azure_client_id, azure_client_secret, azure_tenant_id. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details
│ 
│   with databricks_user.dbuser[0],
│   on resources.adb.tf line 80, in resource "databricks_user" "dbuser":
│   80: resource "databricks_user" "dbuser" {

But if I comment out the custom_parameters block in the azurerm_databricks_workspace resource, I don't see the error. In Azure, I currently have only the Databricks workspace and no cluster. I want to create the cluster and am planning to run Terraform a second time.

A few weeks ago I deleted and recreated my subnets, so they now have new names.

If I comment out custom_parameters, terraform apply fails during cluster creation, saying that it cannot find the old subnet. But the reference to my new subnets is in custom_parameters, which I had to comment out.

So I am in a catch-22 situation now. Any idea how to fix this?


Solution

  • A few changes to your code are needed; please change it as suggested below.

    In the depends_on of db-group, dbuser, i-am-admin and admins, use azurerm_databricks_workspace.db-workspace instead of resource.azurerm_databricks_workspace.db-workspace.

    As suggested in this GitHub discussion, try azurerm provider version 2.78. As a workaround for now, first apply the workspace creation on its own, and then apply the resources within it.
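
    The suggestions above could look roughly like the sketch below. It is only an illustration built on the question's own code, not a tested fix. The "hashicorp/azurerm" source address and the exact pin "= 2.78.0" are assumptions (the answer only says "version 2.78"), and the two commands at the end are one possible way to do the "workspace first, then the rest" workaround using Terraform's -target option.

```hcl
# Sketch only: fragments that would replace the matching parts of the question's code.

terraform {
  required_providers {
    # Assumption: "version 2.78" is read here as an exact pin on 2.78.0.
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "= 2.78.0"
    }
  }
}

resource "databricks_group" "db-group" {
  display_name               = format("adb-users-%s", var.project.name)
  allow_cluster_create       = true
  allow_instance_pool_create = true

  # Plain resource address, without the leading "resource." prefix.
  depends_on = [
    azurerm_databricks_workspace.db-workspace
  ]
}

# The same depends_on change applies to databricks_user.dbuser,
# databricks_group_member.i-am-admin and data.databricks_group.admins.

# Two-step apply, run from the shell:
#   terraform apply -target=azurerm_databricks_workspace.db-workspace
#   terraform apply
```

    The first command creates or updates only the workspace, so its id and workspace_url are already known when the second, full apply configures the databricks provider.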