I have an Azure Data Factory pipeline that triggers a Databricks notebook. Inside this notebook, I have the following code to unmount and mount storage:
# Unmount and mount storage
mnt_point = "/mnt"
out_mnt_point = "/out_mnt"

# Unmount storage, if any
for mount in dbutils.fs.mounts():
    if mount.mountPoint == mnt_point:
        dbutils.fs.unmount(mnt_point)
    elif mount.mountPoint == out_mnt_point:
        dbutils.fs.unmount(out_mnt_point)

# Mount storage for input
dbutils.fs.mount(
    source = "wasbs://" + input_folder + "@xxx.blob.core.windows.net",
    mount_point = mnt_point,
    extra_configs = {"fs.azure.account.key.xxx.blob.core.windows.net": azure_account_key}
)

# Mount storage for output
dbutils.fs.mount(
    source = "wasbs://" + output_folder + "@xxx.blob.core.windows.net",
    mount_point = out_mnt_point,
    extra_configs = {"fs.azure.account.key.xxx.blob.core.windows.net": azure_account_key}
)
My question is: if multiple instances of the pipeline are running concurrently, will they affect each other (e.g. one notebook is mounting while another is unmounting, causing the other process to fail)? Or does each instance have its own isolated resources?
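Yes, concurrent runs can affect each other. Mount points are shared across the workspace rather than isolated per run, so when runs overlap, one of them can fail with an error such as:
Directory already mounted: /mnt/ip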
UPDATE:
Try using the following code:
# Mount storage only if it is not already mounted
azure_account_key = 'SRzAYuN2/aRJuSdHkwSXxXIE3qpBl0ekvtVSQ4BKqFAi+z2SM86qrUM3rt5tD3s68m450n/aledC+AStTrzdBw=='
mnt_point = "/mnt/ip"
out_mnt_point = "/mnt/op"
ip_mount = 0
op_mount = 0

# Check which of the two mount points already exist
for mount in dbutils.fs.mounts():
    if mount.mountPoint == mnt_point:
        ip_mount += 1
    elif mount.mountPoint == out_mnt_point:
        op_mount += 1

# Mount only what is missing, and never unmount
if ip_mount == 0:
    dbutils.fs.mount(
        source = "wasbs://" + input_folder + "@xxx.blob.core.windows.net",
        mount_point = mnt_point,
        extra_configs = {"fs.azure.account.key.xxx.blob.core.windows.net": azure_account_key}
    )
if op_mount == 0:
    dbutils.fs.mount(
        source = "wasbs://" + output_folder + "@xxx.blob.core.windows.net",
        mount_point = out_mnt_point,
        extra_configs = {"fs.azure.account.key.xxx.blob.core.windows.net": azure_account_key}
    )
NOTE: The same applies to simultaneous pipeline runs, not only to the simultaneous activity runs demonstrated above.
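The check-then-mount code above still leaves a small window in which two runs can both see the path as unmounted and both call mount, so one of them may still hit "Directory already mounted". One possible way to tolerate that is sketched below. The helper name ensure_mounted and its fs parameter are hypothetical: fs is assumed to behave like dbutils.fs, and is passed in only so the function can be tried outside Databricks. The sketch also assumes that a given mount point always maps to the same container, so an existing mount can simply be reused.

```python
def ensure_mounted(fs, source, mount_point, extra_configs):
    """Mount `source` at `mount_point` unless it is already mounted.

    `fs` is assumed to expose mounts() and mount() like dbutils.fs.
    Returns True if this call created the mount, False if it already existed.
    """
    def is_mounted():
        return any(m.mountPoint == mount_point for m in fs.mounts())

    if is_mounted():
        return False
    try:
        fs.mount(source=source, mount_point=mount_point, extra_configs=extra_configs)
        return True
    except Exception:
        # Another run may have mounted the same path between the check and
        # the mount call. If the path is mounted now, treat that as success;
        # otherwise the failure is something else, so re-raise it.
        if is_mounted():
            return False
        raise


# Hypothetical usage inside the notebook, reusing the names from the code above:
# ensure_mounted(
#     dbutils.fs,
#     "wasbs://" + input_folder + "@xxx.blob.core.windows.net",
#     mnt_point,
#     {"fs.azure.account.key.xxx.blob.core.windows.net": azure_account_key},
# )
```

Re-checking the mount list after a failure avoids depending on the exact wording of the error message. The helper never unmounts, so it does not address a run that unmounts a path while another run is still reading from it.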