Tags: python, docker, tensorflow, pipenv, python-poetry

Partial dependency management in Python


I am managing a rather large Python project, with many dependencies, that is intended to run inside the TensorFlow Docker container. A common way of defining which dependencies to install in production is through a lock file generated by a tool like Pipenv or Poetry. When creating such a lock file, you usually specify all Python dependencies in order to ensure there are no conflicts between packages. But since the TensorFlow Docker container comes with TensorFlow and all of its dependencies preinstalled, I would like to exclude those packages from my lock file to avoid double installs. I still want my dependency management tool to account for the presence of a specific TensorFlow version when resolving the remaining dependencies for the lock file.

Is there a way to generate lock files that account for preinstalled packages in the environment without having them included in the lock file?


Solution

  • If you're installing your packages into the TensorFlow Docker image, then the TensorFlow dependency is already "locked" by the image name and tag, e.g. tensorflow/tensorflow:2.0.0. So specify only your other Python dependencies in your Pipfile.

    For example, your Dockerfile could contain:

    FROM tensorflow/tensorflow:2.0.0-py3
    
    RUN pip3 install pipenv
    # The destination must end with "/" when copying multiple files
    COPY Pipfile Pipfile.lock /yourproject/
    WORKDIR /yourproject
    
    # --site-packages lets the virtualenv see the preinstalled TensorFlow
    RUN pipenv --three --site-packages
    RUN pipenv install
    

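    A minimal Pipfile for this setup might look like the following (the package names and versions are purely illustrative; the key point is that tensorflow itself is deliberately absent, since the base image provides it):

    ```toml
    [[source]]
    name = "pypi"
    url = "https://pypi.org/simple"
    verify_ssl = true

    [packages]
    # tensorflow is NOT listed here -- it is provided by the base image
    requests = "*"
    pandas = "==0.25.*"
    ```
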
    Then you have TensorFlow and all your other dependencies:

    $ docker build . -t yourproject && docker run -it yourproject bash
    # build info not shown
    root@b04fc204d239:/yourproject# pipenv run python -c "import tensorflow; print(tensorflow.__version__)"
    Loading .env environment variables…
    2.0.0
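
    If this works as intended, tensorflow should appear nowhere in the generated Pipfile.lock. A quick sketch of how you could verify that programmatically; the lock contents below are an illustrative stand-in for a real Pipfile.lock, which is a JSON file with "default" and "develop" package sections:

    ```python
    import json

    # Stand-in for the contents of a real Pipfile.lock
    lock = json.loads("""
    {
        "default": {
            "numpy": {"version": "==1.17.4"},
            "requests": {"version": "==2.22.0"}
        },
        "develop": {}
    }
    """)

    # Collect every package name pinned by the lock file
    locked = set(lock["default"]) | set(lock["develop"])

    # tensorflow is locked by the image tag, not by the lock file
    print("tensorflow" in locked)  # False
    ```
    
    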