I would like to migrate the Spark Scala code below to PySpark on Databricks. It checks whether the container is already mounted and mounts it otherwise; we plan to keep this code centralized for all team members to use. Kindly help us. The configuration uses a service principal:
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> dbutils.secrets.get(scope = "sample-scope", key = "sample--client-id"),
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "sample-scope", key = "sample-client-secret"),
  "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/33333df5c2-953a-444444444/oauth2/token"
)

val adlsPath = "abfss://[email protected]/"
val mountPoint = "/mnt/containername"

if (dbutils.fs.mounts.map(mnt => mnt.mountPoint).contains(mountPoint)) {
  println(mountPoint + " already mounted")
} else {
  println(mountPoint + " not mounted, mounting now")
  try {
    dbutils.fs.mount(
      source = adlsPath,
      mountPoint = mountPoint,
      extraConfigs = configs)
  } catch {
    case e: java.rmi.RemoteException =>
      println("exception encountered while mounting " + adlsPath)
  }
}
You can use the Python code below.
# configs1 = {
#     "fs.azure.account.auth.type": "OAuth",
#     "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
#     "fs.azure.account.oauth2.client.id": dbutils.secrets.get(scope="sample-scope", key="sample-client-id"),
#     "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="sample-scope", key="sample-client-secret"),
#     "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/33333df5c2-953a-444444444/oauth2/token"
# }

configs2 = {
    "fs.azure.account.key.jadls2.blob.core.windows.net": "acc_key"
}

adls_path = "wasbs://[email protected]/"
mount_point = "/mnt/jadls2"

if any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
    print(mount_point + " already mounted")
else:
    print(mount_point + " not mounted, mounting now")
    try:
        dbutils.fs.mount(
            source=adls_path,
            mount_point=mount_point,
            extra_configs=configs2
        )
    except Exception as e:
        if "RemoteException" in str(e):
            print("Exception encountered while mounting " + adls_path)
        else:
            # Don't silently swallow unrelated errors
            raise

dbutils.fs.ls(mount_point)
Here, I tried with an account key, but that's not recommended; you should use a service principal.
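Since the goal is a centralized snippet the whole team can reuse, the check-then-mount logic can also be wrapped in a small helper. This is only a sketch (the function names are my own, and `dbutils` exists only inside a Databricks notebook, so it is passed in as a parameter to keep the mount-point check plain Python):

```python
# Sketch of a reusable "mount if absent" helper for a shared team notebook.
# `dbutils` is passed in explicitly because it is only defined inside Databricks.

def already_mounted(mount_point, existing_mount_points):
    """True if `mount_point` appears in the list of existing mount points."""
    return mount_point in set(existing_mount_points)

def mount_if_absent(dbutils, source, mount_point, extra_configs):
    """Mount `source` at `mount_point` unless it is already mounted.

    Returns True if a new mount was created, False if it already existed.
    """
    existing = [m.mountPoint for m in dbutils.fs.mounts()]
    if already_mounted(mount_point, existing):
        print(mount_point + " already mounted")
        return False
    print(mount_point + " not mounted, mounting now")
    dbutils.fs.mount(source=source,
                     mount_point=mount_point,
                     extra_configs=extra_configs)
    return True

# Example call from a Databricks notebook, with the configs defined above:
# mount_if_absent(dbutils, adls_path, mount_point, configs2)
```

Each team member can then call `mount_if_absent` at the top of their notebook without worrying about whether someone else has already mounted the container.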
Refer to this article for more information: Mount ADLS Gen2 or Blob Storage in Azure Databricks (microsoft.com)