I'm working in a notebook in Azure Machine Learning Studio and I'm using the following code block to instantiate a job using the command function.
from azure.ai.ml import MLClient, command, Input, Output
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription_id>"
resource_group = "<resource_group>"
workspace = "<workspace>"
storage_account = "<storage_account>"
input_path = "<input_path>"
output_path = "<output_path>"

# Connect to the workspace (omitted in the snippet above, but needed below).
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)

input_dict = {
    "input_data_object": Input(
        type=AssetTypes.URI_FILE,
        path=f"azureml://subscriptions/{subscription_id}/resourcegroups/{resource_group}/workspaces/{workspace}/datastores/{storage_account}/paths/{input_path}",
    )
}
output_dict = {
    "output_folder_object": Output(
        type=AssetTypes.URI_FOLDER,
        path=f"azureml://subscriptions/{subscription_id}/resourcegroups/{resource_group}/workspaces/{workspace}/datastores/{storage_account}/paths/{output_path}",
    )
}
job = command(
    code="./src",  # local source directory, uploaded on each submission
    command="python 01_read_write_data.py -v --input_data=${{inputs.input_data_object}} --output_folder=${{outputs.output_folder_object}}",
    inputs=input_dict,
    outputs=output_dict,
    environment="<asset_env>",
    compute="<compute_cluster>",
)
returned_job = ml_client.create_or_update(job)
This runs successfully, but with each run, if the code stored within the ./src directory changes, a new copy is uploaded to the default blob storage account. I don't mind this, but with each run the code is uploaded to a new container at the root of my blob storage account, so my default storage account is getting cluttered with containers. I've read the docs for instantiating a command object using the command() function, but I see no parameter available to control where my ./src code gets uploaded. Is there any way to control this?
You cannot control where a local code directory gets uploaded, but you can instead point the job directly at the location in your storage account where the code is already saved.

The code parameter of the command() job function accepts either a local path or an "http:", "https:", or "azureml:" URL pointing to a remote location. When you give a local path, the SDK creates a container and uploads the code, so I would recommend using an https URL instead.

If your Python code is in the src folder, upload that folder to any location in your storage account and copy its https path.
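For example, here is a minimal sketch of uploading ./src with the azure-storage-blob package; the container name "data" is hypothetical, so substitute your own:

import os
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

account_url = "https://<storage_account>.blob.core.windows.net"
service = BlobServiceClient(account_url, credential=DefaultAzureCredential())
container = service.get_container_client("data")  # hypothetical container name

# Upload every file under ./src so it lands under data/src/ in blob storage.
for root, _, files in os.walk("./src"):
    for name in files:
        local_path = os.path.join(root, name)
        blob_path = "src/" + os.path.relpath(local_path, "./src").replace(os.sep, "/")
        with open(local_path, "rb") as fh:
            container.upload_blob(name=blob_path, data=fh, overwrite=True)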
Sample code:

command_job = command(
    code="https://jgsml28890137600.blob.core.windows.net/data/src/",  # remote code location, nothing is uploaded
    command="echo hi from command",
    environment="azureml:docker-context-example-10:1",
    timeout=10800,
)
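With code pointing at blob storage, submission works the same way as in your snippet and no new container is created; a brief sketch, assuming the same ml_client:

# No ./src upload happens here because code is a remote https path.
returned_job = ml_client.create_or_update(command_job)
print(returned_job.studio_url)  # link to monitor the run in the studio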
Note: the storage account should be associated with your ML workspace; if it is not, create a datastore linking your storage account and then use the https path.
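For reference, a datastore can be registered with the same SDK; a minimal sketch, assuming hypothetical account/container names and an account key (SAS-token credentials also work):

from azure.ai.ml.entities import AzureBlobDatastore, AccountKeyConfiguration

# Hypothetical names; substitute your storage account and container.
store = AzureBlobDatastore(
    name="code_blob_store",
    description="Blob container that holds the job source code.",
    account_name="<storage_account>",
    container_name="data",
    credentials=AccountKeyConfiguration(account_key="<account_key>"),
)
ml_client.create_or_update(store)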