Tags: python-3.x, azure, azure-machine-learning-service

Why does Azure ML Studio (classic) take additional time to execute Python Scripts?


I have been working with ML Studio (classic) and am facing a problem with "Execute Python Script" modules. I have noticed that each module spends time on internal tasks before it starts executing the actual Python code. This overhead adds 40-60 seconds per module, which aggregates to a delay of 400-500 seconds per run when the experiment is consumed through the Batch Execution Service or run manually. (I have multiple "Execute Python Script" modules.)

For instance, if a piece of code takes 2-3 seconds on my local system, the same code takes 50-60 seconds in Azure ML Studio.

Can you please help me understand the reason behind this, or suggest any optimization that can be done?

Regards, Anant


Solution

  • The known limitations of Machine Learning Studio (classic) are:

    The Python runtime is sandboxed and does not allow access to the network or to the local file system in a persistent manner.

    All files saved locally are isolated and deleted once the module finishes. The Python code cannot access most directories on the machine it runs on, the exception being the current directory and its subdirectories.

    When you provide a zipped file as a resource, the files are copied from your workspace to the experiment execution space, unpacked, and then used. Copying and unpacking resources can consume memory.

    The module can output a single data frame. It's not possible to return arbitrary Python objects such as trained models directly back to the Studio (classic) runtime. However, you can write objects to storage or to the workspace. Another option is to use pickle to serialize multiple objects into a byte array and then return the array inside a data frame.

    Hope this helps!
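To illustrate the last point, here is a minimal sketch of the pickle workaround. It assumes the Studio (classic) convention that an Execute Python Script module exposes an `azureml_main` entry point returning a tuple whose first element is a pandas DataFrame; the model dictionary and the `restore` helper are hypothetical stand-ins for your own trained object and downstream module.

```python
import pickle
import pandas as pd

def azureml_main(dataframe1=None, dataframe2=None):
    # Studio (classic) can only receive a DataFrame back from this module,
    # not an arbitrary Python object such as a trained model.
    model = {"coef": [0.5, 1.2], "intercept": 0.1}  # stand-in for a trained model

    # Serialize the object to a byte array and wrap it in a one-row DataFrame
    # so it can cross the module boundary.
    payload = pickle.dumps(model)
    return pd.DataFrame({"model_bytes": [payload]}),

def restore(df):
    # In a downstream module, unpack the byte array back into the object.
    return pickle.loads(df["model_bytes"].iloc[0])
```

Multiple objects can be serialized the same way, one per row or column, as long as everything needed to reconstruct them travels inside the DataFrame.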