Tags: python, docker, docker-compose, jupyter-notebook, sys

Python script not executed with docker compose


I have the following Dockerfile:

FROM jupyter/scipy-notebook

COPY . ./work

RUN pip install -r ./work/requirements.txt

WORKDIR /work 

CMD ["python", "./work/append_dependencies.py"]

and the following docker-compose.yml

version: '3.7'

networks:
  george:


services:
  jupyter:
    build: .
    image: jupyter/datascience-notebook:r-4.0.3
    environment:
      - JUPYTER_TOKEN=password
      - JUPYTER_ENABLE_LAB=yes

    volumes:
      - .:/home/jovyan/work
      - ./notebooks:/home/jovyan/work
      - ./src:/home/jovyan/work
    ports:
      - 7777:8888
    container_name: almond_analysis
    networks:
      - george

The project structure is:

almond_analysis:
    notebooks:
        data_exploration.ipynb
    src:
       __init__.py
       plots.py
    .gitignore
    docker-compose.yml
    Dockerfile
    README.md
    requirements.txt
    setup.py

The file append_dependencies.py looks like this:


import sys

sys.path.append("/home/jovyan/work")

What I would like to do is execute the file append_dependencies.py automatically after I type docker compose up -d --build. Right now, after I build and run my container (with docker compose up -d --build), the container crashes, i.e. after I type docker ps I get the following output:

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

I tried changing CMD to ENTRYPOINT but the result was the same.

Does anyone know how to execute append_dependencies.py right after I build the container with docker compose up -d --build?


Solution

  • Every Docker container has a main process, PID 1, and the container exits as soon as that process ends. The jupyter/scipy-notebook image comes with a default command that starts the notebook server as that main process. By specifying append_dependencies.py as your CMD (or ENTRYPOINT), you replaced that default, so Docker runs your script as the main process instead. As soon as the script ends, whether it succeeds or fails, the container exits (or crashes, as you say).

    Also, even if the script runs, the append would not persist for anything else you do in that container. sys.path.append only affects the Python process in which it is called, so the change is gone once the script exits. The easiest way to get what you want is to set the PYTHONPATH environment variable, which every Python interpreter started in the container reads at startup.

    Add this to your Dockerfile:

    ENV PYTHONPATH="${PYTHONPATH}:/home/jovyan/work"
    

    Then remove the CMD/ENTRYPOINT line from your Dockerfile, so that the image's default command (the one that starts the notebook server) is used again.

    Hope this helps!
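
To see the difference between sys.path.append and PYTHONPATH for yourself, here is a small sketch you can run anywhere, not only in the container. It uses a temporary directory as a stand-in for /home/jovyan/work, and the helper name visible_in_new_interpreter is made up for this illustration. It first runs a throwaway interpreter that appends the directory (roughly what append_dependencies.py does), then starts fresh interpreters with and without PYTHONPATH and checks whether the directory is on their sys.path.

```python
import os
import subprocess
import sys
import tempfile


def visible_in_new_interpreter(directory, env):
    """Start a fresh Python process and report whether `directory` is on its sys.path."""
    check = f"import sys; print({directory!r} in sys.path)"
    result = subprocess.run(
        [sys.executable, "-c", check],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


# Stand-in for /home/jovyan/work (an assumption so the sketch runs outside the container).
with tempfile.TemporaryDirectory() as tmp:
    extra_dir = os.path.abspath(tmp)

    # Environment without any pre-existing PYTHONPATH, to keep the comparison clean.
    base_env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}

    # 1) Mimic append_dependencies.py: a short-lived process appends to its own sys.path.
    subprocess.run(
        [sys.executable, "-c", f"import sys; sys.path.append({extra_dir!r})"],
        env=base_env,
        check=True,
    )
    # A later interpreter does not inherit that change.
    after_append = visible_in_new_interpreter(extra_dir, base_env)

    # 2) Set PYTHONPATH instead: every new interpreter picks it up at startup.
    env_with_pythonpath = dict(base_env, PYTHONPATH=extra_dir)
    after_env = visible_in_new_interpreter(extra_dir, env_with_pythonpath)

print("visible after sys.path.append in another process:", after_append)
print("visible with PYTHONPATH set:", after_env)
```

The first check should come back negative and the second positive, which is why the ENV line in the Dockerfile works where the script does not.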