Tags: python, azure, azure-machine-learning-service

How do I use an environment in an Azure ML Pipeline?


Background

I have created an ML Workspace environment from a conda environment.yml plus some docker config and environment variables. I can access it from within a Python notebook:

env = Environment.get(workspace=ws, name='my-environment', version='1')

I can use this successfully to run a Python script as an experiment, i.e.

runconfig = ScriptRunConfig(source_directory='script/', script='my-script.py', arguments=script_params)
runconfig.run_config.target = compute_target
runconfig.run_config.environment = env
run = exp.submit(runconfig)

Problem

I would now like to run this same script as a Pipeline, so that I can trigger multiple runs with different parameters. I have created the Pipeline as follows:

pipeline_step = PythonScriptStep(
    source_directory='script', script_name='my-script.py',
    arguments=['-a', param1, '-b', param2],
    compute_target=compute_target,
    runconfig=runconfig
)
steps = [pipeline_step]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline.validate()

When I then try to run the Pipeline:

pipeline_run = Experiment(ws, 'my_pipeline_run').submit(
    pipeline, pipeline_parameters={...}
)

I get the following error: Response status code does not indicate success: 400 (Conda dependencies were not specified. Please make sure that all conda dependencies were specified i).

When I view the pipeline run in the Azure Portal it seems that the environment has not been picked up: none of my conda dependencies are configured, hence the code does not run. What am I doing wrong?


Solution

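  • You're almost there, but the runconfig argument of PythonScriptStep expects a RunConfiguration object, not a ScriptRunConfig. Create a RunConfiguration, attach the environment to it, and pass that to the step: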

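    from azureml.core import Environment, Experiment
    from azureml.core.runconfig import RunConfiguration
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep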
    
    env = Environment.get(workspace=ws, name='my-environment', version='1')
    # create a new runconfig object
    runconfig = RunConfiguration()
    runconfig.environment = env
    
    pipeline_step = PythonScriptStep(
        source_directory='script', script_name='my-script.py',
        arguments=['-a', param1, '-b', param2],
        compute_target=compute_target,
        runconfig=runconfig
    )
    
    pipeline = Pipeline(workspace=ws, steps=[pipeline_step])
    
    pipeline_run = Experiment(ws, 'my_pipeline_run').submit(pipeline)
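Since the goal in the question is to trigger multiple runs with different parameters, here is a rough sketch of how that might look with PipelineParameter. The parameter names `param_a` and `param_b`, their default values, and the helper functions `build_parameterized_pipeline` and `parameter_sets` are hypothetical and not part of the original post; `ws`, `compute_target` and `env` are assumed to exist as in the question. The azureml imports sit inside the function only so the snippet can be loaded without the SDK installed.

```python
def build_parameterized_pipeline(ws, compute_target, env):
    """Build a one-step pipeline whose script arguments are pipeline parameters.

    Hypothetical helper: the names param_a / param_b are illustrative only.
    """
    # Imported lazily so this file can be loaded without the Azure ML SDK.
    from azureml.core.runconfig import RunConfiguration
    from azureml.pipeline.core import Pipeline, PipelineParameter
    from azureml.pipeline.steps import PythonScriptStep

    # Attach the registered environment to a plain RunConfiguration.
    runconfig = RunConfiguration()
    runconfig.environment = env

    # Placeholders that are resolved at submit time (defaults are assumptions).
    param_a = PipelineParameter(name="param_a", default_value="1")
    param_b = PipelineParameter(name="param_b", default_value="2")

    step = PythonScriptStep(
        source_directory="script",
        script_name="my-script.py",
        arguments=["-a", param_a, "-b", param_b],
        compute_target=compute_target,
        runconfig=runconfig,
    )
    return Pipeline(workspace=ws, steps=[step])


def parameter_sets(values_a, values_b):
    """Return one pipeline_parameters dict per (a, b) combination."""
    return [{"param_a": a, "param_b": b} for a in values_a for b in values_b]
```

Each dict returned by `parameter_sets` could then be passed as `pipeline_parameters=...` to `Experiment(ws, 'my_pipeline_run').submit(pipeline, ...)`, giving one pipeline run per combination. The keys have to match the `name` given to each PipelineParameter.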