Tags: python, azure, docker, azure-batch

Docker run python script can't find locally installed module


For context, this problem relates to a Docker image that will be run using Azure Batch.

Here is the Dockerfile, in full:

FROM continuumio/miniconda3

ADD . /pipegen

ADD environment.yml /tmp/environment.yml
RUN conda env create -f /tmp/environment.yml

RUN echo "conda activate $(head -1 /tmp/environment.yml | cut -d' ' -f2)" >> ~/.bashrc

ENV PATH /opt/conda/envs/$(head -1 /tmp/environment.yml | cut -d' ' -f2)/bin:$PATH
ENV CONDA_DEFAULT_ENV $(head -1 /tmp/environment.yml | cut -d' ' -f2)

ADD classify.py /classify.py
RUN rm -rf /pipegen

pipegen is the local module (in the same directory as the Dockerfile) that is installed via the environment.yml file. Here is the environment.yml file in full:

name: pointcloudz

channels:
  - conda-forge
  - defaults

dependencies:
  - python=3.7
  - python-pdal
  - entwine
  - matplotlib
  - geopandas
  - notebook
  - azure-storage-blob==1.4.0
  - pip:
    - /pipegen
    - azure-batch==6.0.0

For clarity, the directory structure looks like this:

Dockerfile
pipegen
  \__ __init__.py
  \__ pipegen.py
  \__ utils.py
classify.py
batch_containers.py
environment.yml
setup.py
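
setup.py is the packaging boilerplate that lets pip install /pipegen from the pip: section above; its exact contents aren't shown here, but a minimal sketch of such a file looks like:

from setuptools import setup, find_packages

# Hypothetical minimal setup.py; the actual file may declare more metadata
setup(
    name="pipegen",
    version="0.1.0",
    packages=find_packages(),  # finds the pipegen/ package next to this file
)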

The Dockerfile establishes the environment created using the environment.yml file as the default (conda) python environment when the container is run. Therefore, I can run the container interactively as follows:

docker run -it pdalcontainers.azurecr.io/pdalcontainers/pdal-pipelines

and, from inside the container, execute the classify.py script with some command line arguments, as follows:

python classify.py in.las out.las --defaults

and the script is executed as expected. However, when I run the following command, attempting to execute the very same script from "outside" the container,

docker run -it pdalcontainers.azurecr.io/pdalcontainers/pdal-pipelines python classify.py in.las out.las --defaults

I get the following error:

File "classify.py", line 2, in <module>
    from pipegen.pipegen import build_pipeline, write_las
ModuleNotFoundError: No module named 'pipegen'

Just to be clear, the classify.py script imports pipegen, the local module which is now installed in the conda environment created in the Dockerfile. I need to be able to execute the script using the docker run command above due to constraints in how Azure Batch runs jobs. I've tried multiple fixes but am now pretty stuck. Any wisdom would be greatly appreciated!


Solution

  • The problem you are facing is that you added the conda activate to ~/.bashrc, which bash only sources for interactive shells. When you run the container interactively, that is what you get, so the environment is activated. However, when you invoke the python script directly, no shell is started at all, ~/.bashrc is never sourced, and your conda environment is therefore never activated.
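
    You can see this by checking which interpreter each kind of invocation picks up (a rough diagnostic; the exact prompt and paths will depend on your image):

    # Interactive: bash sources ~/.bashrc, so the environment is activated
    $ docker run -it pdalcontainers.azurecr.io/pdalcontainers/pdal-pipelines
    (pointcloudz) root@abc123:/# which python
    /opt/conda/envs/pointcloudz/bin/python

    # Direct invocation: no shell runs, so ~/.bashrc is never sourced
    $ docker run pdalcontainers.azurecr.io/pdalcontainers/pdal-pipelines which python
    /opt/conda/bin/python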

    One thing you could do is not use conda activate at all and instead run the script with conda run. To simplify the command line, add this entrypoint to your Dockerfile. Note that the exec form of ENTRYPOINT does not invoke a shell, so a variable like $CONDA_DEFAULT_ENV would not be expanded (and ENV performs no command substitution either, so that variable holds the literal $(head ...) text rather than the environment name); name the environment explicitly instead:

    ENTRYPOINT ["conda", "run", "-n", "pointcloudz", "python", "classify.py"]

    Using this in the entrypoint also allows the caller to pass command-line arguments via docker run.

    From the Dockerfile reference

    Command line arguments to docker run will be appended after all elements in an exec form ENTRYPOINT
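
    For example, with the entrypoint above, only the script's arguments need to be supplied from outside the container:

    docker run -it pdalcontainers.azurecr.io/pdalcontainers/pdal-pipelines in.las out.las --defaults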

    For a more detailed explanation, see Activating a Conda environment in your Dockerfile