python, azure, azure-machine-learning-service

Access data on AML datastore from training script


I am looking for a working example of how to access data on an Azure Machine Learning managed datastore from within a train.py script. I followed the instructions in the link, and my script is able to resolve the datastore.

However, whatever I tried (as_download(), as_mount()), all I ever got back was a DataReference object. Or maybe I just don't understand how to actually read data from a file with it.

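from azureml.core import Datastore, Run

# Get the workspace from the current run context
run = Run.get_context()
exp = run.experiment
ws = exp.workspace

# Look up the registered datastore and reference the 'mnist' folder on it
ds = Datastore.get(ws, datastore_name='mydatastore')
data_folder_mount = ds.path('mnist').as_mount()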

# So far this all works. But how to go from here?

Solution

  • You can pass the DataReference object you created as an input to your training run (ScriptRun, Estimator, HyperDrive, or Pipeline). Then, in your training script, you can read the mounted path from the corresponding command-line argument. Full tutorial: https://learn.microsoft.com/en-us/azure/machine-learning/service/tutorial-train-models-with-aml
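
The training-script side of this could look roughly like the sketch below. It assumes the DataReference was passed to the run under an argument named --data-folder; that name and the helper functions parse_data_folder and list_data_files are hypothetical choices for illustration, not part of the SDK. By the time train.py starts, the argument is expected to hold an ordinary local path, so standard file APIs can be used on it.

```python
import argparse
import os


def parse_data_folder(argv=None):
    """Return the value of --data-folder (an assumed argument name).

    The run is expected to replace the DataReference with the local
    mount (or download) path before the script starts.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-folder", dest="data_folder", required=True)
    # parse_known_args ignores any other arguments passed to the script
    args, _unknown = parser.parse_known_args(argv)
    return args.data_folder


def list_data_files(data_folder):
    """Walk the folder and return every file path in sorted order."""
    paths = []
    for root, _dirs, files in os.walk(data_folder):
        for name in files:
            paths.append(os.path.join(root, name))
    return sorted(paths)
```

In train.py you would call parse_data_folder() with no arguments, so that it reads sys.argv, and then open the files returned by list_data_files() with open(), numpy, pandas, or whatever reader matches the data. The DataReference itself is never opened in the script; it is resolved to a path at submission time.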