environment-variables azure-machine-learning-service azure-python-sdk

Environment variables not being set in Azure ML - Python


We generate the environment's Dockerfile programmatically. Here is what the resulting file looks like:

    FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04

    RUN rm /bin/sh && ln -s /bin/bash /bin/sh
    RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc

    RUN echo $'channels:\n\
      - anaconda\n\
      - conda-forge\n\
      - defaults\n\
    dependencies:\n\
      - python=3.8.10\n\
      - pip:\n\
          - azureml-sdk==1.50.0\n\
          - azureml-dataset-runtime==1.50.0\n\
          - azure-storage-blob\n\
          - numpy==1.23.5\n\
          - pandas==2.0.0\n\
          - scipy==1.5.2\n\
          - scikit-learn==1.2.2\n\
          - azure-eventgrid==4.9.0\n\
      - conda:\n\
          - conda=23.3.0' > conda_env.yml
    RUN source /opt/miniconda/etc/profile.d/conda.sh && \
        conda activate && \
        conda install conda && \
        pip install cmake && \
        conda env update -f conda_env.yml

    ENV cluster_identity_name=clisyer-ide-name
    ENV cluster_identity_id=1234567
    ENV data_drift_event_topic_name=someName
    ENV sa_name=someStorage

The image builds successfully, and the env vars look correct in the build logs.

But, when I try to access this environment programmatically:

    if environment_name in environments:
        restored_environment = environments[environment_name]
        logging.info('Found environment: %s:%s', restored_environment.name, restored_environment.version)

The log output shows the correct name and version. But when I print the environment variables of the restored environment, only the example env var is there, not the ones we set in the Dockerfile.

However, when I inspect the environment definition after fetching the environment, the JSON does contain the ENV definitions.

Am I doing something wrong when accessing the environment variables?
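
For what it's worth, the behaviour described above is consistent with the SDK keeping its own environment-variables dictionary separately from the Dockerfile text, so `ENV` instructions would not show up there. Since the `ENV` lines are visible in the fetched definition, one workaround is to parse them out of the Dockerfile text. Below is a minimal sketch, assuming the text is available as a string (in azureml-core this would typically be `restored_environment.docker.base_dockerfile`, which is an assumption about your setup). The helper name `parse_dockerfile_env` is hypothetical, and it only handles single-line, uppercase `ENV key=value` instructions.

```python
import re

# Matches single-line instructions of the form `ENV key=value`.
ENV_LINE = re.compile(r"^\s*ENV\s+(\w+)=(.*)$")

def parse_dockerfile_env(dockerfile_text):
    """Return a dict of the variables set by `ENV key=value` lines."""
    env_vars = {}
    for line in dockerfile_text.splitlines():
        match = ENV_LINE.match(line)
        if match:
            key, value = match.groups()
            # Drop surrounding whitespace and optional double quotes.
            env_vars[key] = value.strip().strip('"')
    return env_vars

# Illustrative Dockerfile text using two of the ENV lines from the question.
sample = "\n".join([
    "FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    "ENV cluster_identity_name=clisyer-ide-name",
    "ENV sa_name=someStorage",
])
recovered = parse_dockerfile_env(sample)
print(recovered)
```

This only recovers what the Dockerfile declares. Inside a running job, the values baked into the image are read from the process environment (for example with `os.environ`), not from the environment object.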


Solution

  • We ended up building custom Docker images with ENV instructions, pushing the images to Azure Container Registry (ACR), then creating the Azure ML environment from the ACR repository and registering that environment in the workspace.

    This way the env vars are baked into the image and are available whenever it is pulled from ACR.

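    from azureml.core import Environment

    def get_environment(workspace, environment_name, dockerfile, **kwargs):
        # Build the environment from a Dockerfile whose ENV instructions
        # define the variables. The remaining keyword arguments are
        # ignored in this abbreviated version.
        new_env = Environment.from_dockerfile(
                    environment_name,
                    dockerfile
                )
        restored_environment = new_env
        # Register the environment in the workspace.
        restored_environment.register(workspace)
        return restored_environment

    # `ws` (the Workspace) and `e` (the settings object) come from the
    # surrounding code. `dockerfile` is not in the original call; it is
    # assumed to hold the Dockerfile path or contents.
    environment_active_monitoring = get_environment(
            workspace=ws,
            environment_name=e.aml_env_name_active_monitoring, # type: ignore
            dockerfile=dockerfile,
            conda_dependencies_file=e.aml_env_active_monitoring_conda_dep_file, # type: ignore
            env_vars=env_vars,
            tag=e.docker_tag,
            create_new=e.rebuild_env_active_monitoring, # type: ignore
            gpu_accelerated=False)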
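
The snippet above builds from a Dockerfile, while the description mentions pointing the environment at an image already pushed to ACR. A minimal sketch of that second variant follows, assuming azureml-core (SDK v1) and an image the workspace is allowed to pull. The helper names `split_acr_image` and `register_acr_environment` and the registry/image names are hypothetical, not taken from the original code. A private registry that is not attached to the workspace may also need credentials, which this sketch leaves out.

```python
def split_acr_image(full_reference):
    """Split 'registry.azurecr.io/repo/image:tag' into (registry, image)."""
    registry_address, _, image = full_reference.partition("/")
    if not image:
        raise ValueError("expected '<registry>/<repository>:<tag>'")
    return registry_address, image

def register_acr_environment(workspace, environment_name, full_reference):
    """Register an environment that uses a prebuilt image as its base."""
    # Imported lazily so the helper above works without azureml-core installed.
    from azureml.core import Environment

    registry_address, image = split_acr_image(full_reference)
    env = Environment(name=environment_name)
    env.docker.base_image = image
    env.docker.base_image_registry.address = registry_address
    # The image already contains Python and its packages, so tell
    # Azure ML not to build a conda environment on top of it.
    env.python.user_managed_dependencies = True
    return env.register(workspace)

# Hypothetical image reference, for illustration only.
print(split_acr_image("myregistry.azurecr.io/monitoring/active:1.0"))
```

Because the ENV values live in the image itself, they are present in every container started from it, regardless of what the environment object reports.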