Tags: docker, docker-image, docker-build

Docker specific images not reusing base image layers?


I use docker buildx bake to build images. One base image is used by all of the specific images, and it contains the layers that take up most of the space.

I expected that images built this way would reuse those layers, so that the total size would be base_image_size + image1_size + image2_size + ..., where each imageK_size counts only the layers added on top of the base.

What seems to happen instead is N*base_image_size + image1_size + image2_size + ..., where N is the number of specific images.

So I quickly ran out of space.


Looking at the size of each image, the 1.97GB one is the base image. All the others take 2.37GB (some of them are the same image under different tags). So I would assume that at least 1.97GB is shared across all images?

Is there a way to optimize how layers are used, so that identical layers are not stored multiple times (if that is what is happening here)?
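
As a rough worked example of the two formulas above, the sketch below plugs in the sizes reported for these images (1.97GB for the base, 2.37GB for each specific image). The image count of 5 and the function name are assumptions made purely for illustration; the real count is not stated in the question.

```python
def total_size_gb(base_gb: float, image_gb: float, n_images: int, shared: bool) -> float:
    """Estimate disk usage for n_images built on one base image.

    image_gb is the full reported size of one specific image (base included).
    """
    specific_gb = image_gb - base_gb  # layers added on top of the base
    if shared:
        # Base layers stored once, plus each image's own layers.
        return base_gb + n_images * specific_gb
    # Base layers stored again for every image.
    return n_images * image_gb


BASE_GB = 1.97   # from the question
IMAGE_GB = 2.37  # from the question
N_IMAGES = 5     # hypothetical count, not from the question

print(f"shared:     {total_size_gb(BASE_GB, IMAGE_GB, N_IMAGES, shared=True):.2f} GB")
print(f"duplicated: {total_size_gb(BASE_GB, IMAGE_GB, N_IMAGES, shared=False):.2f} GB")
```

With these numbers the shared case comes to about 3.97 GB and the duplicated case to about 11.85 GB. The gap grows by roughly one base image size for every additional specific image.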

My bake file looks like this:

target "base" {
  dockerfile = "src/base/Dockerfile"
  contexts = {
    base-src = "src/base"
  }
  tags = ["ghcr.io/myorg/base:${BASE_TAG}"]
}
target "image1" {
  inherits = ["_all"]
  contexts = {
    project-src = "src/projects/image1"
  }
  dockerfile = "${PROJECT_DOCKERFILE}"
  tags = ["ghcr.io/myorg/image1:${PROJECT_TAG}"]
}

Using ncdu, I can see that the base image files are mostly in overlay.


I'm not sure how overlay2 reports usage, but to me it looks like the size is simply duplicated, as if identical layers were not reused (or is the output misleading?).

P.S. I also run an hourly cron job, @hourly docker system prune -af --filter until=24h, but most of the space is still not freed.

Update

This is the base Dockerfile:

#syntax=docker/dockerfile:1.4
# 22.04
FROM ubuntu@sha256:a8fe6fd30333dc60fc5306982a7c51385c2091af1e0ee887166b40a905691fd0

SHELL ["/bin/bash", "-xo", "pipefail", "-c"]

# Args.
ARG ODOO_GROUP=odoo
ARG ODOO_USER=$ODOO_GROUP
ARG ODOO_HOME_PATH="/opt/$ODOO_USER"
# ENVs.
ENV LANG=C.UTF-8
# Path for all requirements
ENV ODOO_REQS_DIR_PATH="$ODOO_HOME_PATH/requirements"
ENV ODOO_REQS_FILE_PATH="$ODOO_REQS_DIR_PATH/requirements.txt"
ENV ODOO_RC_DIR_PATH=/etc/odoo
# ODOO_RC is expected from standard Odoo.
ENV ODOO_RC="$ODOO_RC_DIR_PATH/odoo.conf"
ENV ODOO_DATA_PATH="$ODOO_HOME_PATH/data"
ENV ODOO_ADDONS_ROOT_PATH="$ODOO_HOME_PATH/projects"
ENV ODOO_PATH="$ODOO_HOME_PATH/odoo"
ENV VENV_PATH="$ODOO_HOME_PATH/venv"
ENV PATH="$VENV_PATH/bin:$PATH"
# For postgres 15.
COPY --from=base-src ./keys/pgdg.asc /etc/apt/trusted.gpg.d/pgdg.asc
# Install global packages.
RUN echo "deb http://apt.postgresql.org/pub/repos/apt jammy-pgdg main" > /etc/apt/sources.list.d/pgdg.list \
    && apt update \
    && DEBIAN_FRONTEND=noninteractive apt install --no-install-recommends -y \
        # For convenience
        nano \
        ca-certificates \
        dirmngr \
        curl \
        fonts-noto-cjk \
        node-less \
        npm \
        libldap2-dev \
        libpq-dev \
        libsasl2-dev \
        # Needed for gcc
        build-essential \
        gcc \
        # To be able to kill PIDs by port.
        psmisc \
        python3-dev \
        python3.10-dev \
        python3-pip \
        python3.10-venv \
        postgresql-client-15 \
    && npm install -g rtlcss \
    # Odoo recommended 12.5 no longer works with newer Ubuntu.
    && curl -o wkhtmltox.deb -sSL https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6.1-2/wkhtmltox_0.12.6.1-2.jammy_amd64.deb \
    && echo 'ee88d74834bdec650f7432c7d3ef1c981e42ae7a762a75a01f7f5da59abc18d5 wkhtmltox.deb' | sha256sum -c - \
    && apt-get install -y --no-install-recommends ./wkhtmltox.deb \
    && rm -rf /var/lib/apt/lists/* wkhtmltox.deb \
    && apt clean \
    # Set up users, directories and files.
    && adduser --system --home $ODOO_HOME_PATH --quiet --group $ODOO_USER \
    # Relate odoo user with postgres.
    && postgres -c "createuser -d -R -S $ODOO_USER" 2> /dev/null || true \
    # Mount dirs: data, configs.
    && mkdir -p $ODOO_DATA_PATH \
    && chown $ODOO_USER:$ODOO_GROUP $ODOO_DATA_PATH \
    && mkdir -p $ODOO_RC_DIR_PATH \
    && chown $ODOO_USER:$ODOO_GROUP $ODOO_RC_DIR_PATH \
    # Requirements.
    && mkdir -p $ODOO_REQS_DIR_PATH \
    && chown $ODOO_USER:$ODOO_GROUP $ODOO_REQS_DIR_PATH \
    # Projects
    && mkdir -p $ODOO_ADDONS_ROOT_PATH \
    && chown $ODOO_USER:$ODOO_GROUP $ODOO_ADDONS_ROOT_PATH \
    # Other
    && mkdir -p /entrypoint \
    && chown $ODOO_USER:$ODOO_GROUP /entrypoint \
    && mkdir -p "$ODOO_HOME_PATH/tmp" \
    && chown $ODOO_USER:$ODOO_GROUP "$ODOO_HOME_PATH/tmp"

COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./odoo/requirements.txt "$ODOO_REQS_DIR_PATH/odoo-requirements.in"
COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./requirements.txt "$ODOO_REQS_DIR_PATH/requirements.in"

# Make it available to mount.
VOLUME $ODOO_DATA_PATH

# Expose Odoo services
EXPOSE 8069 8071 8072

USER $ODOO_USER

COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./scripts/pip-compile-install /opt/odoo/venv/bin/pip-compile-install

# Set up virtualenv and install Odoo requirements.
RUN python3 -m venv $VENV_PATH \
    && pip3 install --no-cache-dir pip-tools==6.13.0 wheel \
    && pip-compile-install $ODOO_REQS_DIR_PATH

COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./odoo "$ODOO_PATH"
COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./bins "$ODOO_HOME_PATH/bins"

# Installing odoo from directory for packages that depend on odoo!
RUN pip3 install --no-cache-dir -e $ODOO_PATH \
    # Using precompiled package, because pypi version is outdated and we
    # don't want to install git here just for this package.
    && pip3 install --no-cache-dir "$ODOO_HOME_PATH/bins/anthem-0.13.1.dev33+gcf73513-py2.py3-none-any.whl"
COPY --from=base-src --chown=odoo:odoo ./songs /opt/odoo/songs
RUN pip3 install --no-cache-dir -e /opt/odoo/songs
# Copying again after odoo dir is installed, to replace their odoo binary in
# favor of ours (default binary requires distribution version specified
# and it can disappear if we mount whole odoo src code from host).
COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./scripts/odoo /opt/odoo/venv/bin/odoo
COPY --chown=$ODOO_USER:$ODOO_GROUP ./pytest.ini /opt/odoo/pytest.ini
COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./entrypoint.py entrypoint.py
# Common addons for all projects.
COPY --from=base-src --chown=$ODOO_USER:$ODOO_GROUP ./addons "$ODOO_ADDONS_ROOT_PATH/base"
ENTRYPOINT ["/entrypoint.py"]
CMD ["odoo"]

This is the Dockerfile of a specific image that uses the base image:

#syntax=docker/dockerfile:1.4
FROM base

COPY --from=project-src --chown=odoo:odoo ./requirements.txt "$ODOO_REQS_DIR_PATH/custom-requirements.in"
COPY --from=extra-src --chown=odoo:odoo ./connector/requirements.txt "$ODOO_REQS_DIR_PATH/connector-requirements.in"

RUN pip-compile-install $ODOO_REQS_DIR_PATH

COPY --from=project-src --chown=odoo:odoo ./custom_songs /opt/odoo/custom_songs

RUN pip3 install --no-cache-dir -e /opt/odoo/custom_songs

COPY --from=project-src --chown=odoo:odoo ./odoo.yml /opt/odoo/odoo.yml
COPY --from=project-src --chown=odoo:odoo ./addons/ "$ODOO_ADDONS_ROOT_PATH/custom"
COPY --from=extra-src --chown=odoo:odoo ./ "$ODOO_ADDONS_ROOT_PATH/"

Images are built by running docker buildx bake image1 image2 ...

Update 2

I was building images using docker buildx bake image1 image2 ..., without explicitly listing the base image as a target. It can also be listed explicitly, as in docker buildx bake base image1 image2 .... The specific images are built either way.

When inspecting the images, I can see that the specific images share the same layers, but there is no base image. Does that mean that when the base image is nowhere to be found, those layers are treated as different even though they are the same? Because it looks that way.

For example, image1 has these layers:

"Layers": [
    "sha256:17f623af01e277c5ffe6779af8164907de02d9af7a0e161662fc735dd64f117b",
    "sha256:b528ddfcb77d58b12bdd10a7f3569280e413accf0e9f038f9620326eaf2ffac9",
    "sha256:bc50e18bc14f08ede26fa41677dd441d45e27599d0adcd0da08ba28d5ea13365",
    "sha256:some-specific-layer1"
]

image2 has these layers:

"Layers": [
    "sha256:17f623af01e277c5ffe6779af8164907de02d9af7a0e161662fc735dd64f117b",
    "sha256:b528ddfcb77d58b12bdd10a7f3569280e413accf0e9f038f9620326eaf2ffac9",
    "sha256:bc50e18bc14f08ede26fa41677dd441d45e27599d0adcd0da08ba28d5ea13365",
    "sha256:some-specific-layer-2"
]

The base image itself does not exist locally (or I only had an old base image in which just a few layers matched); only its layers are present, as part of the specific images.
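
To make the comparison of the two Layers lists concrete, here is a small sketch that finds the leading run of identical digests. The digests are copied from the output above; the helper name is made up for this example. It only compares digests and says nothing about how the storage driver actually keeps them on disk.

```python
from itertools import takewhile


def shared_leading_layers(a: list[str], b: list[str]) -> list[str]:
    """Return the layers two images have in common, starting from the bottom.

    Layers are stacked in order, so only an identical leading run is counted.
    """
    return [x for x, _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b))]


# Digests taken from the inspect output in the question.
common = [
    "sha256:17f623af01e277c5ffe6779af8164907de02d9af7a0e161662fc735dd64f117b",
    "sha256:b528ddfcb77d58b12bdd10a7f3569280e413accf0e9f038f9620326eaf2ffac9",
    "sha256:bc50e18bc14f08ede26fa41677dd441d45e27599d0adcd0da08ba28d5ea13365",
]
image1_layers = common + ["sha256:some-specific-layer1"]
image2_layers = common + ["sha256:some-specific-layer-2"]

shared = shared_leading_layers(image1_layers, image2_layers)
print(len(shared))  # 3
```

If the count is lower than expected for two images that should share a base, that is a hint that the base layers were rebuilt rather than reused.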

Solution

  • The problem was that I had changed the way images were built. Instead of building all images in a single job, each specific image was built in a separate job. The base image was not explicitly listed as a build target, so every layer (except one that was reused from an old base image that had been pushed earlier) ended up different in each image, even though the code was the same when building all of them.

    After I switched to building all specific images together in a single job, all common layers are properly reused.
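
One likely reason why separately built copies of the base end up with different layers, even from the same source, is that a layer's digest is a hash of its tar archive, and the archive records file metadata such as modification times. This is an assumption about the cause, not something confirmed in the post. The sketch below uses only the Python standard library to imitate the effect; layer_digest and the sample file are hypothetical and are not how Docker itself produces layers.

```python
import hashlib
import io
import tarfile


def layer_digest(files: dict[str, bytes], mtime: int) -> str:
    """Return the sha256 digest of an uncompressed tar built from `files`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # the only thing that differs between "builds"
            tar.addfile(info, io.BytesIO(data))
    return "sha256:" + hashlib.sha256(buf.getvalue()).hexdigest()


# Hypothetical layer content: one file, identical in every build.
files = {"opt/odoo/requirements/requirements.txt": b"example==1.0\n"}

build_a = layer_digest(files, mtime=1_700_000_000)
build_b = layer_digest(files, mtime=1_700_000_060)  # same content, a minute later
build_a_again = layer_digest(files, mtime=1_700_000_000)

print(build_a == build_b)        # False: same files, different digest
print(build_a == build_a_again)  # True: identical metadata, identical digest
```

Under that assumption, building the base once and letting every specific image start from that single result (as happens when all targets are built in one job) gives all images the same base layer digests.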