Tags: docker, pip, python-3.8, requirements.txt, kaniko

pip3 installs many extra packages when using requirements.txt, making the build fail due to lack of disk space


I'm trying to set up a Docker container (RHEL 8) with Kaniko. In the Dockerfile I install Python 3.8 and use pip3 to install the Python libraries requested for this specific container. requirements.txt lists nine libraries (joblib, nltk, numpy, pandas, scikit-learn, scipy, spacy, torch, transformers), some of which are quite large (for example, torch: 890M). But when I run

RUN python3.8 -m pip install -r requirements.txt

pip works through requirements.txt from top to bottom and downloads each package, but after the last line it also downloads a lot of other packages, some of them huge, such as:

nvidia-cublas-cu11 : 317M
nvidia-cudnn-cu11 : 557M

It also installs many smaller packages, such as MarkupSafe, blis, catalogue, certifi, charset-normalizer, click, confection, cymem, filelock, huggingface-hub, idna, jinja2, langcodes, murmurhash, and so on. The full list is quite long.

I had to increase the runner's disk size by 6G just to cope with the extra downloads, but the build still fails while taking a snapshot of the full filesystem, because it runs out of free disk space.

I increased the disk size from 8G to 10G and then, as a second attempt, to 14G, but the build still fails. I also passed the --single-snapshot option to Kaniko so that it takes a single snapshot at the end instead of a separate snapshot after every step (RUN, COPY). I installed an Nvidia driver in the container and picked a fairly lightweight one (450.102.04), which should not take up much space either.

My question is: are the packages that pip3 installs after the ones listed in requirements.txt dependencies that I must install, or are they optional?

Is there any way to overcome this excessive disk space usage? When I start the build (via GitLab CI with Kaniko), the xfs filesystem has 12G free out of 14G, which should be enough, but the build fails with exit code 1 and the message: "no space left on drive"


Solution

  • I ran into exactly the same problem: my RUN pip install -r requirements.txt --no-cache-dir stage would simply not finish. Like you, I had torch in my requirements.txt, and even after all the listed packages were installed, new ones kept popping up and installing, including the ones you mentioned:

    nvidia-cublas-cu11 : 317M
    nvidia-cudnn-cu11 : 557M
    

    I solved the issue by removing torch from my requirement.txt file and installing it separately with another RUN pip install command in the Dockerfile. Here is my updated Dockerfile:

    FROM python:3.8
    
    WORKDIR /app
    
    COPY requirement.txt requirement.txt
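    # Install everything except torch (torch was removed from requirement.txt)
    RUN pip3 install -r requirement.txt --no-cache-dir
    # Install the CPU-only torch wheel separately, straight from the PyTorch download site
    RUN pip3 install https://download.pytorch.org/whl/cpu/torch-1.7.1%2Bcpu-cp38-cp38-linux_x86_64.whl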
    COPY . .
    
    EXPOSE 5000
    CMD [ "python", "app.py"]
    

    To my relief, the docker build using this Dockerfile took no longer than 5 minutes, with my requirement.txt containing around 40 packages. I am hardly an expert, but I suppose that once the initial set of packages had been installed from requirement.txt, there were no further complications (such as chains of dependencies), so the following pip install for torch ran smoothly. Hopefully someone else can explain this better. Oh, and as you may have noticed in the Dockerfile, I used a CPU version of torch, which may also be why the two packages mentioned above were not installed. Hope it helps.
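
On the question of whether the extra packages are required or optional: pip installs whatever each package declares as a dependency in its metadata, so the extra downloads are most likely dependencies of the listed libraries (the nvidia-* packages presumably come from the default GPU build of torch on PyPI, which would explain why the CPU wheel avoids them). One way to check this in an environment where the packages are already installed is to read that metadata with the standard library's importlib.metadata. The following is only a minimal sketch: the helper names (requirement_name, is_optional, split_requirements) are made up for illustration, and for simplicity it treats anything without an "extra" marker as required, even if a platform marker would skip it on your system.

```python
import re
from importlib import metadata
from typing import Iterable, List, Tuple

# Matches the project name at the start of a requirement string.
_NAME = re.compile(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)")
# An 'extra == "..."' marker means the requirement is installed only on request.
_EXTRA = re.compile(r"\bextra\s*==")


def requirement_name(requirement: str) -> str:
    """Return the bare project name from a requirement string."""
    match = _NAME.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return match.group(1)


def is_optional(requirement: str) -> bool:
    """True when the requirement only applies to an optional extra."""
    _, _, marker = requirement.partition(";")
    return _EXTRA.search(marker) is not None


def split_requirements(requirements: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split requirement strings into (required, optional) project names."""
    required: List[str] = []
    optional: List[str] = []
    for requirement in requirements:
        target = optional if is_optional(requirement) else required
        target.append(requirement_name(requirement))
    return required, optional


if __name__ == "__main__":
    # Package names taken from the question; adjust to your own requirements.txt.
    for package in ("torch", "spacy", "transformers"):
        try:
            declared = metadata.requires(package) or []
        except metadata.PackageNotFoundError:
            print(f"{package}: not installed")
            continue
        required, optional = split_requirements(declared)
        print(f"{package}: required={required} optional={optional}")
```

Run it with the same interpreter that pip installed into (for example python3.8 inside the container). Anything listed under "required" is pulled in automatically by pip and cannot be skipped without risking a broken install; the usual way to save space is to choose a lighter build of the parent package, as with the CPU torch wheel above.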