I am trying to mount Azure Data Lake Storage Gen2 to Databricks, but it fails and I can't work out why. There is a secret called "DataLakeSecret" in the App Registration, and a Key Vault secret named "StorageGen2Secret" whose value is the DataLakeSecret.
// http://www.stevedem.com/mounting-adls-gen2-in-databricks/
// https://docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html#mount-adls-filesystem&language-scala
//Session configuration
val applicationid = "1111158b9-3525-4c62-8c48-d3d7e2c16a6a"
val secret = "1111xEPjpOIBJtBS-W9B9Zsv7h9IF:qw"
val tenantID = "11114839-0afa-4fae-a34a-326c42112bca"
val scopename = "key-vault-secrets"
val keynameforservicecredential = "StorageGen2Secret"
val fileSystemName = "fileshare1"
val storageaccountname = "111datalake"
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<applicationid>",
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<scopename>", key = "<keynameforservicecredential>"))

dbutils.fs.mount(
  source = s"abfss://$fileSystemName@$storageaccountname.dfs.core.windows.net/",
  mountPoint = "/mnt/<mount-name>",
  extraConfigs = configs)
// ERROR:
java.lang.IllegalArgumentException: Secret does not exist with scope: <scopename> and key: <keynameforservicecredential>
    at com.databricks.backend.common.rpc.SimpleSecretManagerClient.getSecret(SecretManagerClient.scala:228)
    at com.databricks.dbutils_v1.impl.SecretUtilsImpl.getBytesInternal(SecretUtilsImpl.scala:46)
    at com.databricks.dbutils_v1.impl.SecretUtilsImpl.get(SecretUtilsImpl.scala:61)
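Note what the exception actually prints: the literal text <scopename> and <keynameforservicecredential>. Quoting an angle-bracket placeholder passes that placeholder verbatim to the secret manager instead of substituting your declared values. A minimal illustration of the difference (pure Scala, no Databricks required; variable names taken from the session configuration above):

```scala
// The declared variables from the session configuration.
val scopename = "key-vault-secrets"
val keynameforservicecredential = "StorageGen2Secret"

// Quoting the placeholder sends the literal string "<scopename>" ...
val quotedScope = "<scopename>"
// ... while referencing the variable sends its value.
val actualScope = scopename

println(quotedScope) // prints the placeholder, exactly as seen in the error
println(actualScope) // prints key-vault-secrets
```

The same applies to the key argument: pass keynameforservicecredential (the variable), not "<keynameforservicecredential>" (a string).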
You can follow the steps below to create a mount point using Azure Key Vault.
You should have the following information:
• Client ID (a.k.a. Application ID) => Key Name as ClientID = 06exxxxxxxxxxd60ef
• Client Secret (a.k.a. Application Secret) => Key Name as ClientSecret = ArrIxxxxxxxxxxxxxxbMt]*
• Directory ID (a.k.a. Tenant ID) => Key Name as DirectoryID = https://login.microsoftonline.com//oauth2/token
• Databricks Secret Scope Name => chepra
• File System Name => filesystem
• Storage Account Name => chepragen2
• Mount Name => Kenny
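With those values in hand, the OAuth configuration is just a Map with five well-known keys. A quick sanity check that nothing is missing (the ID, secret, and endpoint values here are placeholders, not real credentials):

```scala
// Placeholder values standing in for your real client ID, secret, and endpoint.
val clientId     = "06exxxxxxxxxxd60ef"
val clientSecret = "ArrIxxxxxxxxxxxxxxbMt]*"
val endpoint     = "https://login.microsoftonline.com/<tenant-id>/oauth2/token"

val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> clientId,
  "fs.azure.account.oauth2.client.secret" -> clientSecret,
  "fs.azure.account.oauth2.client.endpoint" -> endpoint)

// The five keys the ABFS driver expects for client-credential OAuth.
val required = Seq(
  "fs.azure.account.auth.type",
  "fs.azure.account.oauth.provider.type",
  "fs.azure.account.oauth2.client.id",
  "fs.azure.account.oauth2.client.secret",
  "fs.azure.account.oauth2.client.endpoint")
assert(required.forall(configs.contains))
```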
Azure Data Lake Gen2 mount normal method:
Scala Code:
val configs = Map(
"fs.azure.account.auth.type" -> "OAuth",
"fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"fs.azure.account.oauth2.client.id" -> "06ecXXXXXXXXXXXXXXXXXX60ef",
"fs.azure.account.oauth2.client.secret" -> "ArXXXXXXXXXXXXXMt]*",
"fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/72f98XXXXXXXXXXXXXXXXXXX1db47/oauth2/token")
// Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
source = "abfss://[email protected]/",
mountPoint = "/mnt/Kenny",
extraConfigs = configs)
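The source URI follows a fixed pattern, so it can be assembled from the file system (container) and storage account names. A small helper sketch (the function name is illustrative, not a Databricks API):

```scala
// Builds the abfss source URI for a given file system (container) and storage account.
def abfssSource(fileSystem: String, account: String): String =
  s"abfss://$fileSystem@$account.dfs.core.windows.net/"

println(abfssSource("filesystem", "chepragen2"))
// abfss://[email protected]/
```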
Azure Data Lake Gen2 mount using Azure Key-vault:
Creating scope using Azure Key Vault:
Note: the scope name is the Key Vault name, i.e. "chepra", and the keys are created as shown below.
Go to Azure Portal => Select the Key Vault created => create secrets as follows:
ClientID = 06exxxxxxxxxxd60ef
ClientSecret = ArrIxxxxxxxxxxxxxxbMt]*
DirectoryID = https://login.microsoftonline.com//oauth2/token
Scala Code:
val configs = Map(
"fs.azure.account.auth.type" -> "OAuth",
"fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
"fs.azure.account.oauth2.client.id" -> dbutils.secrets.get(scope = "chepra", key = "ClientID"),
"fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "chepra", key = "ClientSecret"),
"fs.azure.account.oauth2.client.endpoint"-> dbutils.secrets.get(scope = "chepra", key = "DirectoryID"))
// Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
source = "abfss://[email protected]/",
mountPoint = "/mnt/Kenny01",
extraConfigs = configs)