python azure-machine-learning-service

Failure of experiment when using base_dockerfile instead of base_image


I am trying to submit an experiment to the Azure Machine Learning service using a custom Docker image. Everything works when I provide a prebuilt Docker image (base_image), but the run fails if I provide a Dockerfile (base_dockerfile) instead.

Using base_dockerfile in the DockerSection object is documented here and was added in v1.0.53 of the SDK (as noted here).

Example code:

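from azureml.core.runconfig import DockerSection

ds = DockerSection()
ds.enabled = True
# Dockerfile content passed inline; each instruction goes on its own line
ds.base_dockerfile = "FROM ubuntu:latest\nRUN echo 'Hello world!'"
ds.base_image = None  # cleared so that the Dockerfile is used instead of a registry image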

The rest of the code is the same as when running with a predefined image from the registry (i.e. setting base_image instead of base_dockerfile in the code above).
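Since the Dockerfile is passed as a plain Python string, one thing worth ruling out locally is a formatting problem in that string, for example instructions that ended up on a single line. The sketch below is a rough heuristic, not part of the Azure ML SDK: `check_dockerfile_text` and `KNOWN_INSTRUCTIONS` are hypothetical names, and the check only looks for an upper-case Dockerfile instruction appearing in the middle of a line.

```python
# Hypothetical helper, not part of the Azure ML SDK: a heuristic sanity check
# for Dockerfile content that is passed around as a Python string.
KNOWN_INSTRUCTIONS = {
    "FROM", "RUN", "CMD", "LABEL", "EXPOSE", "ENV", "ADD", "COPY",
    "ENTRYPOINT", "VOLUME", "USER", "WORKDIR", "ARG", "ONBUILD",
    "STOPSIGNAL", "HEALTHCHECK", "SHELL", "MAINTAINER",
}


def check_dockerfile_text(text):
    """Return a list of likely problems in inline Dockerfile text."""
    # Drop blank lines and comment lines before inspecting the instructions.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines:
        return ["Dockerfile text is empty"]

    problems = []
    first_word = lines[0].split()[0].upper()
    if first_word not in ("FROM", "ARG"):
        problems.append("first instruction is %r, expected FROM (or ARG)" % first_word)

    for ln in lines:
        words = ln.split()
        # ONBUILD and HEALTHCHECK legitimately wrap another instruction.
        if words[0].upper() in ("ONBUILD", "HEALTHCHECK"):
            continue
        for word in words[1:]:
            if word in KNOWN_INSTRUCTIONS:
                problems.append("%r appears mid-line in %r; missing newline?" % (word, ln))
    return problems


one_line = "FROM ubuntu:latest RUN echo 'Hello world!'"
two_lines = "FROM ubuntu:latest\nRUN echo 'Hello world!'"
print(check_dockerfile_text(one_line))
print(check_dockerfile_text(two_lines))
```

A clean result here does not mean the service will accept the Dockerfile; it only rules out one class of local mistake before submitting.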

Example error from the ML service:

raise ActivityFailedException(error_details=json.dumps(error, indent=4))
azureml.exceptions._azureml_exception.ActivityFailedException: ActivityFailedException:
    Message: Activity Failed:
{
    "error": {
        "code": "ServiceError",
        "message": "InternalServerError",
        "details": []
    },
    "correlation": {
        "operation": null,
        "request": "K/C4FSnEz74="
    },
    "environment": "southcentralus",
    "location": "southcentralus",
    "time": "2019-08-20T16:33:17.130928Z"
}
    InnerException None
    ErrorResponse
{"error": {"message": "Activity Failed:\n{\n \"error\": {\n \"code\": \"ServiceError\",\n \"message\": \"InternalServerError\",\n \"details\": []\n },\n \"correlation\": {\n \"operation\": null,\n \"request\": \"K/C4FSnEz74=\"\n },\n \"environment\": \"southcentralus\",\n \"location\": \"southcentralus\",\n \"time\": \"2019-08-20T16:33:17.130928Z\"\n}"}}
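The error details are JSON, so the fields that matter when reporting the failure (error code, correlation request ID, region, timestamp) can be pulled out programmatically. This is a minimal sketch that assumes the payload has the shape shown in the error above; `summarize_activity_error` is a hypothetical helper, not an SDK function.

```python
import json


def summarize_activity_error(error_details):
    """Extract the fields useful for a bug report from the error JSON.

    Hypothetical helper; assumes the payload shape shown in the error above.
    """
    payload = json.loads(error_details)
    error = payload.get("error", {})
    correlation = payload.get("correlation", {})
    return {
        "code": error.get("code"),
        "message": error.get("message"),
        "request_id": correlation.get("request"),
        "region": payload.get("location"),
        "time": payload.get("time"),
    }


# The payload from the failed run, copied from the error message.
sample = """
{
    "error": {"code": "ServiceError", "message": "InternalServerError", "details": []},
    "correlation": {"operation": null, "request": "K/C4FSnEz74="},
    "environment": "southcentralus",
    "location": "southcentralus",
    "time": "2019-08-20T16:33:17.130928Z"
}
"""

summary = summarize_activity_error(sample)
print(summary)
```

An InternalServerError with an empty details list points at the service side rather than the client code, so the request ID and timestamp are probably the most useful things to include in a report.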

I've used an example Dockerfile in the code above (taken from the SDK documentation), but I get the same error if I use the Dockerfile that was used to build the base image, which works fine when pulled from the registry.

Any ideas, or pointers to samples where this actually works, would be appreciated!


Solution

  • Thanks for reporting this issue! This appears to be a bug that our team is investigating.