Because my service principal secret expired (concept: https://www.thedataswamp.com/blog/databricks-connect-to-azure-sql-with-service-principal), I need to remount all mount points with the new service principal secret across all clusters.
I have 3 clusters: clusterA, clusterB & clusterC. I also have 3 containers, mounted as /mnt/001disk, /mnt/002disk and /mnt/003disk, which all three clusters mount.
So now I need to remount all mount points with the scripts below:
storage_account_name = "aaaaaaaaaa"
tenant = 'xxxxxxxxxxxxxxx'
service_principal_id = "yyyyyy"
service_principal_secret = "zzzzzzzzzzzzzzzzzzz"

def func_mount_adls(source: str, mount_point: str):
    print(source)
    print(mount_point)
    # declare necessary information for ADLS connection
    configs = {
        'fs.azure.account.auth.type': 'OAuth',
        'fs.azure.account.oauth.provider.type': 'org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider',
        'fs.azure.account.oauth2.client.id': service_principal_id,
        'fs.azure.account.oauth2.client.secret': service_principal_secret,
        'fs.azure.account.oauth2.client.endpoint': f'https://login.microsoftonline.com/{tenant}/oauth2/token'
    }
    mount_points = list(map(lambda mount: mount.mountPoint, dbutils.fs.mounts()))
    # mount storage if not mounted yet
    if mount_point not in mount_points:
        dbutils.fs.mount(
            source=source,
            mount_point=mount_point,
            extra_configs=configs
        )
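(Side note: the secret doesn't have to be hardcoded; it can be read from a Databricks secret scope at runtime. A minimal sketch, assuming a hypothetical scope named my-scope holding the secret under the key sp-secret:)

# Hypothetical scope/key names -- replace with your own secret scope.
# The value is resolved at runtime, so no plaintext secret lands in the notebook source.
service_principal_secret = dbutils.secrets.get(scope="my-scope", key="sp-secret")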
# Databricks notebook source
import os
import re
import time
from datetime import datetime, timedelta

storage_account_name = os.environ['storage_account_name']

# COMMAND ----------

listToReMnt = [
    "001disk",
    "002disk",
    "003disk"
]

def unmount_container(container_name):
    print(f"unmount /mnt/{container_name}")
    # unmount only if currently mounted; dbutils.fs.unmount raises an error otherwise
    mount_points = list(map(lambda mount: mount.mountPoint, dbutils.fs.mounts()))
    if f"/mnt/{container_name}" in mount_points:
        dbutils.fs.unmount(f"/mnt/{container_name}")

def mount_container(container_name):
    source = f'abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/'
    mount_point = f'/mnt/{container_name}'
    mount_points = list(map(lambda mount: mount.mountPoint, dbutils.fs.mounts()))
    print(f"mount /mnt/{container_name}")
    print(mount_points)
    # mount storage if not mounted yet
    if mount_point not in mount_points:
        try:
            func_mount_adls(source=source, mount_point=mount_point)
        except Exception as e:
            print(e)

for x in listToReMnt:
    print("==============================================================")
    unmount_container(x)
    mount_container(x)
    print(f'dbutils.fs.ls("/mnt/{x}")')
    print(dbutils.fs.ls(f"/mnt/{x}"))
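For completeness, a quick way to double-check the result on the current cluster is to list every mount point together with its backing source:

# List all mount points and their sources on this cluster
for m in dbutils.fs.mounts():
    print(m.mountPoint, "->", m.source)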
So here is my question: do I need to run the remount script on each cluster, or after I run it on one cluster, will the others be updated accordingly? And either way, how do I ensure cluster security (e.g. whether access happens via a system identity or via AD)?
Based on the documentation https://docs.databricks.com/dbfs/mounts.html, which mentions:
When you create a mount point through a cluster, cluster users can immediately access the mount point. To use the mount point in another running cluster, you must run dbutils.fs.refreshMounts() on that running cluster to make the newly created mount point available for use.
Does this mean that when I update one cluster (e.g. clusterA), the others (e.g. clusterB & clusterC) are updated automatically?
If not, does it mean I need to remount those containers on every cluster manually? (That seems a bit silly.)
Note that Terraform and similar tools are not an option due to requirements & limitations.
refreshMounts() helps in applying the changes made to a mount point across all clusters. You don't have to remount the containers on every cluster; after remounting from one cluster, run refreshMounts() in all the other clusters respectively.
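A minimal sketch of that second step, to be run in a notebook attached to each of the other clusters (clusterB and clusterC in this example):

# Run on each remaining cluster after remounting from clusterA:
# picks up the updated mount table without remounting anything.
dbutils.fs.refreshMounts()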