Tags: azure, environment, azureml-python-sdk

Error when trying to create a scheduled job in Azure ML


I'm getting this error when submitting a scheduled pipeline:

Failed to submit job due to Exception: Response status code does not indicate success: 400 (Exception(s) encountered while validating environment definition. See inner exception for details. (). Microsoft.RelInfra.Common.Exceptions.ErrorResponseException: Exception(s) encountered while validating environment definition. See inner exception for details. (BaseImage, BaseDockerfile, or BuildContext must be set for Docker-based environments.) (Conda dependencies were not specified. Please make sure that conda dependencies are specified in your run configuration.)

Here is my code:


from azureml.core import Experiment, ScriptRunConfig, Environment, Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core.runconfig import RunConfiguration
from azureml.pipeline.core import Pipeline, ScheduleRecurrence, Schedule
import datetime
ws = Workspace.from_config()

cluster_name = "cpu-cluster"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print(f'Found existing cluster, use it: {cluster_name}')
except ComputeTargetException:
    print(f"{cluster_name} not found")
env = Environment.from_conda_specification(name='job-env', file_path='job-env.yml')
from azureml.pipeline.steps import PythonScriptStep

# Assuming `compute_target` and `env` are already defined
script_step = PythonScriptStep(
    name="Run Python Script",
    script_name="search.py",
    compute_target=compute_target,
    source_directory="./",
    runconfig=env,
    allow_reuse=True
)

experiment = Experiment(workspace=ws, name='search_experiment2')

from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

pipeline = Pipeline(workspace=ws, steps=[script_step])
published_pipeline = pipeline.publish(
    name="search_pipeline_2",
    description="Pipeline Description",
    continue_on_step_failure=False)
from azureml.pipeline.core import Schedule, ScheduleRecurrence

recurrence = ScheduleRecurrence(frequency="Day", interval=1)  # Define your recurrence pattern

schedule = Schedule.create(
    workspace=ws, 
    name="DailyScriptRun",
    description="Daily run of script",
    pipeline_id=published_pipeline.id,
    experiment_name="searchExperimentName",
    recurrence=recurrence)



And here is my environment YAML file (job-env.yml):

name: job-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip=21.2.4
  - pip:
    - aenum==3.1.15
    - aiohttp==3.9.3
    - aiosignal==1.3.1
    - annotated-types==0.6.0
    - anyio==4.3.0
    - appdirs==1.4.4
    - APScheduler==3.10.4
    - attrs==23.2.0
    - azure-ai-documentintelligence==1.0.0b2
    - azure-ai-formrecognizer==3.3.2
    - azure-ai-textanalytics==5.3.0
    - azure-common==1.1.28
    - azure-core==1.30.1
    - azure-identity==1.15.0
    - azure-keyvault-secrets==4.8.0
    - azure-search==1.0.0b2
    - azure-search-documents==11.6.0b2
    - azure-storage-blob==12.19.1
    - azure-storage-file-datalake==12.14.0
    - beautifulsoup4==4.12.3
    - bs4==0.0.2
    - case_conversion==2.1.0
    - certifi==2024.2.2
    - cffi==1.16.0
    - chardet==5.2.0
    - charset-normalizer==3.3.2
    - clipboard==0.0.4
    - colorama==0.4.6
    - colour==0.1.5
    - contourpy==1.2.0
    - cryptography==42.0.5
    - cursor==1.3.5
    - cycler==0.12.1
    - decorator==4.4.2
    - dill==0.3.8
    - distro==1.9.0
    - fonttools==4.49.0
    - frozenlist==1.4.1
    - fuzzywuzzy==0.18.0
    - gender-guesser==0.4.0
    - h11==0.14.0
    - html-text==0.5.2
    - httpcore==1.0.4
    - httpx==0.27.0
    - idna==3.6
    - imageio==2.34.0
    - imageio-ffmpeg==0.4.9
    - infi.systray==0.1.12
    - inflect==7.0.0
    - isodate==0.6.1
    - kiwisolver==1.4.5
    - lxml==5.1.0
    - matplotlib==3.8.3
    - maybe-else==0.2.1
    - mbstrdecoder==1.1.3
    - moviepy==1.0.3
    - msal==1.27.0
    - msal-extensions==1.1.0
    - msoffcrypto-tool==5.3.1
    - msrest==0.7.1
    - multidict==6.0.5
    - numpy==1.26.4
    - O365==2.0.34
    - oauthlib==3.2.2
    - office365==0.3.15
    - Office365-REST-Python-Client==2.5.6
    - olefile==0.47
    - openai==1.7.1
    - packaging==24.0
    - pandas==2.2.1
    - parsedatetime==2.6
    - pathmagic==0.3.14
    - pillow==10.2.0
    - portalocker==2.8.2
    - prettierfier==1.0.3
    - proglog==0.1.10
    - pycparser==2.21
    - pydantic==2.6.3
    - pydantic_core==2.16.3
    - pydub==0.25.1
    - pyinstrument==4.6.2
    - pyiotools==0.3.18
    - PyJWT==2.8.0
    - pymiscutils==0.3.14
    - PyMuPDF==1.23.25
    - PyMuPDFb==1.23.22
    - pyparsing==3.1.2
    - pypdf==4.1.0
    - PyPDF2==3.0.1
    - pyperclip==1.8.2
    - PyQt5==5.15.10
    - PyQt5-Qt5==5.15.2
    - PyQt5-sip==12.13.0
    - pysubtypes==0.3.18
    - python-dateutil==2.9.0.post0
    - python-docx==1.1.0
    - python-dotenv==1.0.1
    - pytz==2024.1
    - pywin32==306
    - readchar==4.0.5
    - regex==2023.12.25
    - requests==2.31.0
    - requests-oauthlib==1.4.0
    - Send2Trash==1.8.2
    - simplejson==3.19.2
    - six==1.16.0
    - sniffio==1.3.1
    - soupsieve==2.5
    - stringcase==1.2.0
    - tabulate==0.9.0
    - tenacity==8.2.3
    - tiktoken==0.6.0
    - tqdm==4.66.2
    - typepy==1.3.2
    - typing_extensions==4.10.0
    - tzdata==2024.1
    - tzlocal==5.2
    - urllib3==2.2.1
    - XlsxWriter==3.2.0
    - yarl==1.9.4

The error indicates that the problem is in the environment. Also, what I'm trying to do is not related to ML; I want to use a compute cluster because my script will take at least an hour to run. Is this the best way to do it?


Solution

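  • According to this documentation, the runconfig parameter of PythonScriptStep expects a RunConfiguration object. Your code passes an Environment instead, so the conda dependencies and base image never reach the run configuration, which matches the two validation messages in the error.

    So, use RunConfiguration as in the code below.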

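    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.core.runconfig import RunConfiguration

    # Build the run configuration from the conda specification file and pass
    # it as runconfig (the original code passed the Environment object here).
    script_step = PythonScriptStep(
        name="Run Python Script",
        script_name="search.py",
        compute_target=compute_target,
        source_directory="./",
        runconfig=RunConfiguration(conda_dependencies=CondaDependencies(conda_dependencies_file_path='job-env.yml')),
        allow_reuse=True
    )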
    

    Refer to the documentation above for more about the parameters of RunConfiguration.
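
    If you would rather keep the Environment object that the question already builds with Environment.from_conda_specification, another option is to attach it to a RunConfiguration through its environment attribute. The sketch below is only an illustration under that assumption: the helper names build_run_config and build_script_step are hypothetical, and the step arguments are copied from the question.

```python
def build_run_config(environment):
    """Wrap an existing azureml Environment in a RunConfiguration."""
    # Imported inside the function so the SDK is only needed when the
    # helper is actually called.
    from azureml.core.runconfig import RunConfiguration

    run_config = RunConfiguration()
    # Attach the Environment so the run configuration carries the
    # environment definition that the validation error reported as missing.
    run_config.environment = environment
    return run_config


def build_script_step(compute_target, environment):
    """Create the step from the question, passing a RunConfiguration."""
    from azureml.pipeline.steps import PythonScriptStep

    return PythonScriptStep(
        name="Run Python Script",
        script_name="search.py",
        compute_target=compute_target,
        source_directory="./",
        runconfig=build_run_config(environment),
        allow_reuse=True,
    )
```

    With the names from the question, this would be called as script_step = build_script_step(compute_target, env). Since the schedule refers to a published pipeline by id, the pipeline would presumably need to be published again, and the schedule recreated, for the corrected step to take effect.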