Tags: endpoint, azure-machine-learning-service

Azure ML Online Endpoint deployment DriverFileNotFound Error


The Azure ML online endpoint commands work when I run them locally, but when I try to deploy to Azure I get the error below. The command I ran:

az ml online-deployment create --name blue --endpoint "unique-name" -f endpoints/online/managed/sample/blue-deployment.yml --all-traffic

{
    "status": "Failed",
    "error": {
        "code": "DriverFileNotFound",
        "message": "Driver file with name score.py not found in provided dependencies. Please check the name of your file.",
        "details": [
            {
                "code": "DriverFileNotFound",
                "message": "Driver file with name score.py not found in provided dependencies. Please check the name of your file.\nThe build log is available in the workspace blob store \"coloraiamlsa\" under the path \"/azureml/ImageLogs/1673692e-e30b-4306-ab81-2eed9dfd4020/build.log\"",
                "details": [],
                "additionalInfo": []
            }
        ],
        

This is the deployment YAML, taken straight from the azureml-examples repo:

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: my-endpoint
model:
  local_path: ../../model-1/model/sklearn_regression_model.pkl
code_configuration:
  code: 
    local_path: ../../model-1/onlinescoring/
  scoring_script: score.py
environment: 
  conda_file: ../../model-1/environment/conda.yml
  image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1
instance_type: Standard_F2s_v2
instance_count: 1
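
Since the error complains about score.py, one quick thing to rule out is a plain path problem on the local side. The Python sketch below assumes that relative local_path values in the YAML are resolved against the directory containing the YAML file, not the shell's working directory. The function names are made up for illustration and are not part of any Azure SDK.

```python
from pathlib import Path


def resolve_driver_file(yaml_file: str, code_local_path: str, scoring_script: str) -> Path:
    """Return the location where the scoring script is expected on disk.

    Assumption: code_local_path is relative to the folder that holds the
    deployment YAML, not to the current working directory.
    """
    yaml_dir = Path(yaml_file).resolve().parent
    return (yaml_dir / code_local_path / scoring_script).resolve()


def driver_file_exists(yaml_file: str, code_local_path: str, scoring_script: str) -> bool:
    """True if the scoring script is a regular file at the resolved location."""
    return resolve_driver_file(yaml_file, code_local_path, scoring_script).is_file()


if __name__ == "__main__":
    # Values copied by hand from the command and YAML in the question.
    driver = resolve_driver_file(
        "endpoints/online/managed/sample/blue-deployment.yml",
        "../../model-1/onlinescoring/",
        "score.py",
    )
    print(driver, "exists" if driver.is_file() else "is MISSING")
```

If this reports that the file exists and the deployment still fails with DriverFileNotFound, the cause is probably not the local layout.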

Solution

  • After a lot of head banging, I was finally able to consistently reproduce this bug in another Azure ML workspace.

    I tried deploying the same sample in a brand-new Azure ML workspace, and it went smoothly.

    At that point I remembered that I had upgraded the storage account of my previous Azure ML workspace to Data Lake Gen2.

    So I did the same upgrade on the new workspace’s storage account. After the upgrade, deploying the same endpoint fails with the same DriverFileNotFound error!

    It seems Azure ML online deployments do not work when the workspace's storage account has Data Lake Gen2 capabilities, although the documentation says it is supported: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-access-data#supported-data-storage-service-types

    For now my only option is to create a new workspace and deploy my code there. I hope the Azure team fixes this soon.
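
To check whether a workspace's storage account has been upgraded this way, one option is to look at the isHnsEnabled (hierarchical namespace) field in the output of az storage account show. The Python sketch below is only an illustration: the helper names are hypothetical, and the resource group is a placeholder you would need to fill in. It also assumes the az executable is on PATH and can be launched directly, which may not hold on Windows.

```python
import json
import subprocess


def is_hns_enabled(account_json: str) -> bool:
    """True if the storage account JSON reports hierarchical namespace enabled."""
    account = json.loads(account_json)
    # The field can be absent or null on accounts that were never upgraded.
    return bool(account.get("isHnsEnabled"))


def fetch_account_json(name: str, resource_group: str) -> str:
    """Run `az storage account show` and return its raw JSON output."""
    result = subprocess.run(
        [
            "az", "storage", "account", "show",
            "--name", name,
            "--resource-group", resource_group,
            "--output", "json",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example (requires a logged-in Azure CLI; the resource group is a placeholder):
# print(is_hns_enabled(fetch_account_json("coloraiamlsa", "<resource-group>")))
```

A result of True would match the situation described above, where the account was upgraded to Data Lake Gen2.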