pyspark, jupyter-notebook, databricks, azure-databricks, azure-data-lake-gen2

How to access different storage accounts with the same container name in Databricks notebooks


I have two different storage accounts with the same container name. Let's say the storage accounts are named tenant1 and tenant2, and both have a container named "appdata". I can create and mount both containers to DBFS, but I am unable to read/write dynamically by passing the storage account name to the mount point code. Since DBFS uses /mnt/containername as the mount point, only the latest or a previously passed storage account's mount point is referenced in Databricks. How can I achieve my goal here?


Solution

  • Mount points should be static, so you just need two different mount points, each pointing to the correct container, something like this:

    /mnt/storage1_appdata
    /mnt/storage2_appdata
    

    So if you want your code to be dynamic, use f"/mnt/{storage_name}_appdata".

    It's not recommended to dynamically remount containers: you can get cryptic errors when you remount a mount point while somebody is reading or writing data through it.
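A minimal sketch of the static mount point approach, assuming the storage accounts are named tenant1 and tenant2 as in the question. The helper names (`mount_point_for`, `mount_source_for`, `ensure_mounted`) are hypothetical, and `extra_configs` stands for whatever authentication settings your workspace uses. `dbutils` is passed in explicitly because it only exists inside a Databricks notebook.

```python
def mount_point_for(storage_account, container="appdata"):
    # One static mount point per storage account, e.g. /mnt/tenant1_appdata
    return f"/mnt/{storage_account}_{container}"


def mount_source_for(storage_account, container="appdata"):
    # ABFS URI of the container in the given ADLS Gen2 storage account
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/"


def ensure_mounted(dbutils, storage_account, extra_configs, container="appdata"):
    # Mount only if the mount point is missing; never remount one that may be in use
    mount_point = mount_point_for(storage_account, container)
    already_mounted = any(m.mountPoint == mount_point for m in dbutils.fs.mounts())
    if not already_mounted:
        dbutils.fs.mount(
            source=mount_source_for(storage_account, container),
            mount_point=mount_point,
            extra_configs=extra_configs,
        )
    return mount_point


# The path is derived from the account name; the mounts themselves stay fixed
for account in ("tenant1", "tenant2"):
    print(mount_point_for(account))
```

In a notebook, a read would then look something like `spark.read.parquet(mount_point_for(storage_name) + "/<path>")`, where `<path>` is a placeholder for a location inside the container.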

    Also, you can access ADLS directly if you specify the correct configuration for your cluster/job (see doc). You can even access both containers at the same time; you just need to set up the configuration for both storage accounts:

    spark.conf.set(
      "fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net",
      "OAuth")
    spark.conf.set(
      "fs.azure.account.oauth.provider.type.<storage-account-name>.dfs.core.windows.net", 
      "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set(
      "fs.azure.account.oauth2.client.id.<storage-account-name>.dfs.core.windows.net", 
      "<application-id>")
    spark.conf.set(
      "fs.azure.account.oauth2.client.secret.<storage-account-name>.dfs.core.windows.net", 
      dbutils.secrets.get(scope="<scope-name>",key="<service-credential-key-name>"))
    spark.conf.set(
      "fs.azure.account.oauth2.client.endpoint.<storage-account-name>.dfs.core.windows.net", 
      "https://login.microsoftonline.com/<directory-id>/oauth2/token")
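To apply the settings above to both storage accounts without repeating them, one option is to build the configuration keys per account and set them in a loop. This is only a sketch: `oauth_conf` and `abfss_path` are hypothetical helper names, and it assumes each account is reached through a service principal whose application ID, secret, and directory ID you supply.

```python
def oauth_conf(storage_account, application_id, client_secret, directory_id):
    # Build the per-account Spark settings shown above as a dict
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": application_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }


def abfss_path(storage_account, container="appdata", relative_path=""):
    # Direct ADLS Gen2 URI; the account name is part of the path, so no mount is needed
    return (
        f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


# In a Databricks notebook (spark and dbutils are provided by the runtime):
# for account in ("tenant1", "tenant2"):
#     secret = dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key-name>")
#     for key, value in oauth_conf(account, "<application-id>", secret, "<directory-id>").items():
#         spark.conf.set(key, value)
# df = spark.read.parquet(abfss_path("tenant1", relative_path="<path>"))
```

Because the settings are keyed by storage account name, both accounts can be configured in the same session, and the same container name in each account no longer collides.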