How to do a clean install of python from source in a Docker container? (Image gets very large)


I currently have to create Docker images that build Python from source. For example, we need two different Python versions in one container (one for building and one for testing the application), and we need to specify the exact Python version to install, which apt install does not allow because newer versions are not available there.

My problem at the moment is that the image gets really large when Python is built from source, and I do not yet fully understand why.

Let's take the following image as an example:

# we start with a prebuilt python image to set system python to 3.13
FROM WWW.SOMEURL.COM/python:3.13-slim-bullseye  

# now we install the build dependencies required to build python from source
RUN apt update -y &&\
    apt upgrade -y &&\
    apt-get install --no-install-recommends --yes \
        build-essential \
        zlib1g-dev \
        libncurses5-dev \
        libgdbm-dev \
        libnss3-dev \
        libssl-dev \
        libreadline-dev \
        libffi-dev \
        libsqlite3-dev \
        libbz2-dev \
        git \
        wget &&\
    apt-get clean

# next we altinstall another python version by building it from source
RUN   cd /usr/src &&\
      wget "https://www.python.org/ftp/python/3.11.11/Python-3.11.11.tgz" &&\
      tar xzf "Python-3.11.11.tgz" &&\
      cd "Python-3.11.11" &&\
      ./configure &&\
      make altinstall

# finally we remove the build dependencies to save some space
RUN apt-get remove --purge  -y \
        build-essential \
        zlib1g-dev \
        libncurses5-dev \
        libgdbm-dev \
        libnss3-dev \
        libssl-dev \
        libreadline-dev \
        libffi-dev \
        libsqlite3-dev \
        libbz2-dev \
        git \
        wget &&\
    apt-get autoremove --purge -y &&\
    apt-get autoclean -y

# verify installation
RUN echo "DEBUG: Path to alt python: $(which python3.11) which has version $(python3.11 --version)"

For me this process results in a very large image, although the Python installation itself should not be that large (~150-200 MB on a local machine). Building Python from source nevertheless seems to add around 800 MB to the image. Why is this the case?

Thank you for your help!


New Dockerfile based on the answers, which greatly reduces the final size of the image (by ~50%):

# we start with a prebuilt python image to set system python to 3.13; if you don't need that, you can use any other image and perform the same steps (maybe swap altinstall for install)
FROM WWW.SOMEURL.COM/python:3.13-slim-bullseye  

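# copy the helper scripts into the image (their location in the build context is an assumption)
COPY install_build_deps.sh altinstall_python.sh remove_build_deps.sh /tmp/

# install and remove build dependencies in a single RUN step, i.e. a single layer
RUN cd /tmp && \
    bash install_build_deps.sh && \
    bash altinstall_python.sh && \
    bash remove_build_deps.sh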

# verify installation
RUN echo "DEBUG: Path to alt python: $(which python3.11) which has version $(python3.11 --version)"

Script install_build_deps.sh (now also removes /var/lib/apt/lists/*):

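set -e  # abort on the first failing command
apt-get update -y
apt-get upgrade -y
# the trailing backslashes keep the package list part of the install command
apt-get install --no-install-recommends --yes \
   build-essential \
   zlib1g-dev \
   libncurses5-dev \
   libgdbm-dev \
   libnss3-dev \
   libssl-dev \
   libreadline-dev \
   libffi-dev \
   libsqlite3-dev \
   libbz2-dev \
   wget
rm -rf /var/lib/apt/lists/*
apt-get clean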

Script altinstall_python.sh (now deletes the tarball and the extracted sources, and works in /usr/local/src):

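set -e  # abort on the first failing command
cd "/usr/local/src"
wget "https://www.python.org/ftp/python/3.11.11/Python-3.11.11.tgz"
tar xzf "Python-3.11.11.tgz"
cd "Python-3.11.11"
./configure
make altinstall
# go back up first: the tarball and the source tree both live in /usr/local/src
cd ..
rm "Python-3.11.11.tgz"
rm -r "Python-3.11.11"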

Script remove_build_deps.sh:

apt-get remove --purge  -y \
   build-essential \
   zlib1g-dev \
   libncurses5-dev \
   libgdbm-dev \
   libnss3-dev \
   libssl-dev \
   libreadline-dev \
   libffi-dev \
   libsqlite3-dev \
   libbz2-dev \
   wget &&\
apt-get autoremove --purge -y &&\
apt-get autoclean -y

Thanks a lot for the help. If there are further optimizations, let me know and I will update this so that others can use it as a reference.
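One further optimization that may be worth trying is a multi-stage build, a different technique from the single-RUN approach above: Python is compiled in a throwaway stage and only the installed files are copied into the final image, so the build dependencies never need to be removed. The following is a minimal sketch, not a tested recipe. The stage name builder and the install prefix /opt/python3.11 are assumptions made for illustration, the registry host is the placeholder from the question, and it assumes the shared libraries Python needs at runtime (OpenSSL, libffi, SQLite and so on) are already present in the final base image.

```dockerfile
# Stage 1: throwaway build stage; its layers are not part of the final image
FROM WWW.SOMEURL.COM/python:3.13-slim-bullseye AS builder

RUN apt-get update && \
    apt-get install --no-install-recommends --yes \
        build-essential \
        zlib1g-dev \
        libncurses5-dev \
        libgdbm-dev \
        libnss3-dev \
        libssl-dev \
        libreadline-dev \
        libffi-dev \
        libsqlite3-dev \
        libbz2-dev \
        wget

# Install into a dedicated prefix (assumed path) so it can be copied as one directory
RUN cd /usr/local/src && \
    wget "https://www.python.org/ftp/python/3.11.11/Python-3.11.11.tgz" && \
    tar xzf "Python-3.11.11.tgz" && \
    cd "Python-3.11.11" && \
    ./configure --prefix=/opt/python3.11 && \
    make -j"$(nproc)" && \
    make altinstall

# Stage 2: final image, which receives only the installed Python tree
FROM WWW.SOMEURL.COM/python:3.13-slim-bullseye

COPY --from=builder /opt/python3.11 /opt/python3.11
ENV PATH="/opt/python3.11/bin:${PATH}"

# verify installation
RUN python3.11 --version
```

Because the compiler, headers, tarball and source tree exist only in the builder stage, no cleanup commands are needed there. The second Python lives under its own prefix, so the system Python 3.13 of the base image is left untouched.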


Solution

  • Read up on Dockerfile best practices, for example https://docs.docker.com/build/building/best-practices/#apt-get . Remove the source directory and any build artifacts once you are done installing. Remove packages in the same stage in which you install them, that is, in the same RUN instruction and therefore the same layer.

    Additionally, you might be interested in the pyenv project, which streamlines compiling Python.

    Do not use /usr/src for your own files; it is a system directory (see the Linux FHS). I usually use the home directory in Docker, but /usr/local/src also looks fine.
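    The reason this matters is that every RUN instruction creates its own image layer, and files deleted in a later layer still take up space in the earlier one. The points above can be sketched as a single RUN instruction without helper scripts. This is a minimal sketch under assumptions: the ARG names BUILD_DEPS and PY_VERSION and the parallel make -j option are illustrative additions, and the registry host is the placeholder from the question.

```dockerfile
FROM WWW.SOMEURL.COM/python:3.13-slim-bullseye

# Build dependencies, listed once so that install and purge stay in sync (assumed ARG name)
ARG BUILD_DEPS="build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev libbz2-dev wget"
ARG PY_VERSION=3.11.11

# Install, build and clean up in ONE layer, so nothing temporary is ever committed
RUN set -eux; \
    apt-get update; \
    apt-get install --no-install-recommends --yes ${BUILD_DEPS}; \
    cd /usr/local/src; \
    wget "https://www.python.org/ftp/python/${PY_VERSION}/Python-${PY_VERSION}.tgz"; \
    tar xzf "Python-${PY_VERSION}.tgz"; \
    cd "Python-${PY_VERSION}"; \
    ./configure; \
    make -j"$(nproc)"; \
    make altinstall; \
    cd /; \
    rm -rf "/usr/local/src/Python-${PY_VERSION}" "/usr/local/src/Python-${PY_VERSION}.tgz"; \
    apt-get purge --yes --auto-remove ${BUILD_DEPS}; \
    rm -rf /var/lib/apt/lists/*

# verify installation
RUN python3.11 --version
```

    To see where the size goes, docker history followed by the image name lists each layer with its size, which should show whether the build step still leaves a large layer behind.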