azure databricks azure-databricks azure-bicep

databricks deployment bicep suddenly failing

The following bicep deployment has suddenly started failing and I do not understand why

var managedResourceGroupName = 'rg-metadata-${dataBricksName}'

resource databrickworkspace 'Microsoft.Databricks/workspaces@2018-04-01' = {
  name: dataBrickName
  location: location
  sku: {
    name: dataBrickPricingTier
  }
  properties: {
    managedResourceGroupId: managedResourceGroup.id
    parameters: {
      enableNoPublicIp: {
        value: disablePublicIp
      }
    }
  }
}

resource managedResourceGroup 'Microsoft.Resources/resourceGroups@2021-04-01' existing = {
  scope: subscription()
  name: managedResourceGroupName
}

This pretty much follows the same doc : here.

The error message I get is

The filter 'principalId eq ''' is not supported. Supported filters are either 'atScope()' or 'principalId eq '{value}' or assignedTo('{value}')

This error message is quite cyrptic and does not tell me much. Also, this has been succeeding so far. The problem has started now.

Looking at the full doc here , I can see it has things like

It says required, but the definition isn't clear as to what do I have to put there. I have never had to use the principalId so far, during the creation of a databricks workspace.

Another fun fact is: the deletion of the resource (databricks workspace which is anyways in a failed state), does not go through and fails too or gets stuck.

Solution

I raised a MS Support case for this, as I ended up spending a lot of hours on it. Also, it wasn't something wrong that I was doing- especially because it worked in the past and also, that the same bicep worked flawlessly the next morning.

MS supporthave acknowledged it was a service failure (details below). So, if some of you encounter these cryptic errors and have no idea, what could be going wrong, raise a support case.

**Summary of Impact**: 
A few users using Azure Databricks may have experienced failures during Workspace creation with 
Premium Pricing Tier.
 
**Preliminary Root Cause**: 
We determined that a new feature deployed in a dependent backend service, intended to aid 
in the mapping of new workspaces, encountered rate limit issues during the resource provisioning 
process, causing Workspace creation failures as mentioned above.
Mitigation: We performed a rollback of the new feature deployment to a previously known good 
version to mitigate the issue and restore functionality. Additionally, service health has been 
closely monitored to ensure no failures are detected before the issue was declared as fully 
mitigated.Full-service functionality has been confirmed.