I have two different storage accounts with the same container name. Let's say the storage accounts are named tenant1 and tenant2, and both contain a container called "appdata". I can create and mount both containers to DBFS, but I am unable to read/write dynamically by passing the storage account name to the mount-point code. Since DBFS uses /mnt/containername as the mount point, only the latest (most recently passed) storage account's mount point is referenced in Databricks. How can I achieve my goal here?
Mount points should be static, so you just need two different mount points, each pointing to its own container, something like this:
/mnt/storage1_appdata
/mnt/storage2_appdata
so if you want your code to be dynamic, use f"/mnt/{storage_name}_appdata".
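A minimal sketch of that approach, assuming both containers are mounted once up front and the service principal credentials come from a secret scope (the application ID, directory ID, scope and key names below are placeholders, not values from the question):

# Mount each tenant's "appdata" container under its own mount point (run once).
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key-name>"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<directory-id>/oauth2/token",
}

for storage_name in ["tenant1", "tenant2"]:
    mount_point = f"/mnt/{storage_name}_appdata"
    # Mount only if this mount point doesn't already exist.
    if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
        dbutils.fs.mount(
            source=f"abfss://appdata@{storage_name}.dfs.core.windows.net/",
            mount_point=mount_point,
            extra_configs=configs,
        )

# Later, code can pick the right data just by passing the storage account name:
def read_appdata(storage_name, relative_path):
    return spark.read.parquet(f"/mnt/{storage_name}_appdata/{relative_path}")

df = read_appdata("tenant1", "some/path")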
It's not recommended to remount containers dynamically - you can get cryptic errors if a mount point is remounted while somebody is reading or writing data through it.
Also, you can access ADLS directly if you specify the correct configuration for your cluster/job (see the documentation). You can even access both containers at the same time; you just need to set up the configuration for both storage accounts:
spark.conf.set("fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net",
"OAuth")
spark.conf.set(
"fs.azure.account.oauth.provider.type.<storage-account-name>.dfs.core.windows.net",
"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(
"fs.azure.account.oauth2.client.id.<storage-account-name>.dfs.core.windows.net",
"<application-id>")
spark.conf.set(
"fs.azure.account.oauth2.client.secret.<storage-account-name>.dfs.core.windows.net",
dbutils.secrets.get(scope="<scope-name>",key="<service-credential-key-name>"))
spark.conf.set(
"fs.azure.account.oauth2.client.endpoint.<storage-account-name>.dfs.core.windows.net",
"https://login.microsoftonline.com/<directory-id>/oauth2/token")