Tags: scala, apache-spark, azure-databricks

Mounting ADLS - Secret does not exist with scope: <scopename> and key: <keynameforservicecredential>


I am trying to mount Data Lake Gen2 storage in Databricks, but it fails and I don't understand why. The App Registration has a client secret called "DataLakeSecret", and there is a Key Vault secret named "StorageGen2Secret" whose value is that client secret.

//http://www.stevedem.com/mounting-adls-gen2-in-databricks/
//https://docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html#mount-adls-filesystem&language-scala

//Session configuration
val applicationid = "1111158b9-3525-4c62-8c48-d3d7e2c16a6a"
val secret = "1111xEPjpOIBJtBS-W9B9Zsv7h9IF:qw"
val tenantID = "11114839-0afa-4fae-a34a-326c42112bca"
val scopename = "key-vault-secrets"
val keynameforservicecredential = "StorageGen2Secret"
val fileSystemName = "fileshare1"
val storageaccountname = "111datalake"

val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<applicationid>",
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<scopename>", key = "<keynameforservicecredential>"),
  "fs.azure.account.oauth2.client.endpoint" -> s"https://login.microsoftonline.com/$tenantID/oauth2/token")

dbutils.fs.mount(
  source = s"abfss://$fileSystemName@$storageaccountname.dfs.core.windows.net/",
  mountPoint = "/mnt/<mount-name>",
  extraConfigs = configs)

//ERROR:
java.lang.IllegalArgumentException: Secret does not exist with scope: <scopename> and key: <keynameforservicecredential>
    at com.databricks.backend.common.rpc.SimpleSecretManagerClient.getSecret(SecretManagerClient.scala:228)
    at com.databricks.dbutils_v1.impl.SecretUtilsImpl.getBytesInternal(SecretUtilsImpl.scala:46)
    at com.databricks.dbutils_v1.impl.SecretUtilsImpl.get(SecretUtilsImpl.scala:61)

Solution

  • You can follow the steps below to create a mount point using Azure Key Vault.

    You should have the following information:

    • Client ID (a.k.a. Application ID) => Key Name as ClientID = 06exxxxxxxxxxd60ef

    • Client Secret (a.k.a. Application Secret) => Key Name as ClientSecret = ArrIxxxxxxxxxxxxxxbMt]*

    • Directory ID (a.k.a. Tenant ID) => Key Name as DirectoryID = https://login.microsoftonline.com//oauth2/token (stored as the full OAuth endpoint URL; the tenant ID is redacted here)

    • Databricks Secret Scope Name => chepra

    • File System Name => filesystem

    • Storage Account Name => chepragen2

    • Mount Name => Kenny
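
    For reference, the endpoint and source URI that the mount uses are plain strings assembled from the values above. A minimal sketch (the tenant ID here is a made-up placeholder; substitute your own Directory ID):

    ```scala
    // Assemble the OAuth token endpoint and ABFSS source URI from the values listed above.
    // The tenant ID is a hypothetical placeholder.
    val tenantID = "72f988bf-0000-0000-0000-2d7cd011db47"
    val fileSystemName = "filesystem"
    val storageAccountName = "chepragen2"

    val endpoint = s"https://login.microsoftonline.com/$tenantID/oauth2/token"
    val source = s"abfss://$fileSystemName@$storageAccountName.dfs.core.windows.net/"
    ```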

    Azure Data Lake Gen2 mount normal method:

    Scala Code:

    val configs = Map(
      "fs.azure.account.auth.type" -> "OAuth",
      "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
      "fs.azure.account.oauth2.client.id" -> "06ecXXXXXXXXXXXXXXXXXX60ef",
      "fs.azure.account.oauth2.client.secret" -> "ArXXXXXXXXXXXXXMt]*",
      "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/72f98XXXXXXXXXXXXXXXXXXX1db47/oauth2/token")
    
    // Optionally, you can add <directory-name> to the source URI of your mount point.
    dbutils.fs.mount(
      source = "abfss://filesystem@chepragen2.dfs.core.windows.net/",
      mountPoint = "/mnt/Kenny",
      extraConfigs = configs)
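
    After the mount call succeeds, a quick way to confirm it is readable (a sketch; `dbutils` calls run only inside a Databricks notebook):

    ```scala
    // Sketch (Databricks notebook only): verify the mount is readable.
    dbutils.fs.ls("/mnt/Kenny").foreach(f => println(f.path))

    // The mount also appears in the cluster-wide mount table:
    dbutils.fs.mounts().filter(_.mountPoint == "/mnt/Kenny").foreach(println)
    ```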
    


    Azure Data Lake Gen2 mount using Azure Key-vault:

    Creating scope using Azure Key Vault:

    Note: The scope name is the Key Vault name, i.e. "chepra", and the keys are created as shown.

    Go to the Azure Portal => select the Key Vault you created => create the secrets as follows:

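
    The question's error ("Secret does not exist with scope: <scopename> and key: <keynameforservicecredential>") usually means the literal placeholder strings were passed to `dbutils.secrets.get` instead of real scope and key names. After creating the secrets, you can confirm what the cluster actually sees (a sketch; runs only inside a Databricks notebook):

    ```scala
    // Sketch (Databricks notebook only): list the visible secret scopes and the
    // keys in the Key Vault-backed scope "chepra" before using them.
    dbutils.secrets.listScopes().foreach(s => println(s.name))   // expect "chepra"
    dbutils.secrets.list("chepra").foreach(k => println(k.key))  // expect ClientID, ClientSecret, DirectoryID
    ```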

    Scala Code:

    val configs = Map(
      "fs.azure.account.auth.type" -> "OAuth",
      "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
      "fs.azure.account.oauth2.client.id" -> dbutils.secrets.get(scope = "chepra", key = "ClientID"),
      "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "chepra", key = "ClientSecret"),
      "fs.azure.account.oauth2.client.endpoint" -> dbutils.secrets.get(scope = "chepra", key = "DirectoryID"))
    
    // Optionally, you can add <directory-name> to the source URI of your mount point.
    dbutils.fs.mount(
      source = "abfss://filesystem@chepragen2.dfs.core.windows.net/",
      mountPoint = "/mnt/Kenny01",
      extraConfigs = configs)
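
    If the cell is re-run, `dbutils.fs.mount` fails when the mount point already exists. A common guard (a sketch, assuming the `configs` Map above; Databricks notebook only) is to unmount first:

    ```scala
    // Sketch (Databricks notebook only): make the mount idempotent by
    // unmounting an existing mount point before mounting again.
    val mountPoint = "/mnt/Kenny01"
    if (dbutils.fs.mounts().exists(_.mountPoint == mountPoint))
      dbutils.fs.unmount(mountPoint)

    dbutils.fs.mount(
      source = "abfss://filesystem@chepragen2.dfs.core.windows.net/",
      mountPoint = mountPoint,
      extraConfigs = configs)
    ```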
    
