I have registered a datastore that points to ADLS (Azure Data Lake Storage).
datastore = mlclient.datastores.get(ds_name)
from azureml.fsspec import AzureMachineLearningFileSystem
#azureml://subscriptions/<subid>/resourcegroups/<rgname>/workspaces/<workspace_name>/datastore/datastorename
ds_url = f"azureml://subscriptions/{subscriptionID}/resourcegroups/{RG}/workspaces/{ws_name}/datastore/adls/paths/iris-processed/*"
fs = AzureMachineLearningFileSystem(ds_url)
fs.ls()
I am getting the following error even if I use datastore.id:
ValueError: azureml://subscriptions/xx/resourcegroups/xx/workspaces/xx/datastore/adls/paths/iris-processed/* is not a valid datastore uri: azureml://subscriptions/([^\/]+)/resourcegroups/([^\/]+)/(?:Microsoft.MachineLearningServices/)?workspaces/([^\/]+)/datastores/([^\/]+)/paths/(.*)
ValueError: azureml://subscriptions/xx/resourcegroups/xx/workspaces/xx/datastore/adls/paths/iris-processed/* is not a valid datastore uri: azureml://subscriptions/([^\/]+)/resourcegroups/([^\/]+)/(?:Microsoft.MachineLearningServices/)?workspaces/([^\/]+)/datastores/([^\/]+)/paths/(.*)
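The above error occurs when the URI does not match the expected pattern, for example when one of its parts (subscription ID, resource group, workspace name, datastore name, or path) is wrong or misplaced. In your URI the segment in front of the datastore name is "datastore", while the pattern in the error message requires "datastores".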
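As a rough way to see the mismatch, the URI can be checked locally against the pattern quoted in the error message. This is only a sketch: the pattern is copied from the error text, the helper name is_valid_datastore_uri is made up for illustration, and the library's own validation may differ in details such as case handling.

```python
import re

# Pattern copied from the ValueError message; how the library applies it
# internally (flags, match vs. fullmatch) is an assumption here.
DATASTORE_URI_PATTERN = re.compile(
    r"azureml://subscriptions/([^/]+)/resourcegroups/([^/]+)/"
    r"(?:Microsoft.MachineLearningServices/)?workspaces/([^/]+)/"
    r"datastores/([^/]+)/paths/(.*)"
)


def is_valid_datastore_uri(uri: str) -> bool:
    """Hypothetical helper: True if the URI matches the quoted pattern."""
    return DATASTORE_URI_PATTERN.fullmatch(uri) is not None


# URI from the question: uses "datastore" (singular).
bad_uri = (
    "azureml://subscriptions/xx/resourcegroups/xx/workspaces/xx"
    "/datastore/adls/paths/iris-processed/*"
)
# Same URI with "datastores" (plural), as the pattern requires.
good_uri = (
    "azureml://subscriptions/xx/resourcegroups/xx/workspaces/xx"
    "/datastores/adls/paths/iris-processed/*"
)

print(is_valid_datastore_uri(bad_uri))   # False
print(is_valid_datastore_uri(good_uri))  # True
```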
I ran your code with a correctly formed URI and got the expected results.
Code:
from azureml.fsspec import AzureMachineLearningFileSystem
subscription_id = 'Subscription-id'
resource_group = 'Your-resource-group'
workspace_name = 'Workspacename'
input_datastore_name = 'datastore1'
path_on_datastore = 'folder1/'
# azureml://subscriptions/<subid>/resourcegroups/<rgname>/workspaces/<workspace_name>/datastores/<datastore_name>/paths/<path>
ds_url = f'azureml://subscriptions/{subscription_id}/resourcegroups/{resource_group}/workspaces/{workspace_name}/datastores/{input_datastore_name}/paths/{path_on_datastore}'
fs = AzureMachineLearningFileSystem(ds_url)
f_list = fs.ls()
print(f_list)
Output:
['datastore1/folder1/09-05-2023 (1).html', 'datastore1/folder1/09-05-2023.html', 'datastore1/folder1/10-05-2023.html', 'datastore1/folder1/10-05=2023.html', 'datastore1/folder1/11-05-2023.html', 'datastore1/folder1/12-05-2023 (1).html', 'datastore1/folder1/12-05-2023.html', 'datastore1/folder1/timezone.csv']
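If only some of the listed files are needed (the URI in the question ended in a wildcard), one option is to filter the list returned by fs.ls() in plain Python. A minimal sketch, assuming the listing is a list of path strings like the output above; the helper name filter_paths is hypothetical and not part of azureml.fsspec.

```python
from fnmatch import fnmatchcase

# Sample listing copied from the output above.
listing = [
    "datastore1/folder1/09-05-2023 (1).html",
    "datastore1/folder1/09-05-2023.html",
    "datastore1/folder1/10-05-2023.html",
    "datastore1/folder1/10-05=2023.html",
    "datastore1/folder1/11-05-2023.html",
    "datastore1/folder1/12-05-2023 (1).html",
    "datastore1/folder1/12-05-2023.html",
    "datastore1/folder1/timezone.csv",
]


def filter_paths(paths, pattern):
    """Hypothetical helper: keep paths whose file name matches the pattern."""
    return [p for p in paths if fnmatchcase(p.rsplit("/", 1)[-1], pattern)]


csv_files = filter_paths(listing, "*.csv")
print(csv_files)  # ['datastore1/folder1/timezone.csv']
```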
Reference: "Is there a way to get list of folders from a datastore in Azure ML studio with Python SDK v2" (Stack Overflow), by khemanth958.