In my team, we currently use a single Docker image for cross-building a software library; we add several cross-build toolchains to that one image (under /opt/...). The Dockerfile and the toolchains all live in one git repository (the toolchain archives via git-lfs).
The Dockerfile is convenient because it gives us a comprehensive, formal description of the build environment that can be reproduced on different machines.
Now we keep getting toolchains from customers, and sometimes new versions of those toolchains. As a result, our Docker image is growing fast and currently uses about 40 GB. What's even worse is the time it takes to build a new image whenever we need to add, modify, or remove a toolchain.
So I started doubting that we are doing things in a "good practice" way.
Using a separate Docker image for each toolchain would save the time of rebuilding the huge single image every time. But it's even worse when it comes to space (because each image would contain the Linux base system with build tools, etc.).
I'd be happy about any hints on how to do this in a better way.
Using a separate Docker image for each toolchain would save the time of rebuilding the huge single image every time. But it's even worse when it comes to space (because each image would contain the Linux base system with build tools, etc.).
This is mostly not true, due to a Docker feature called layers.
What are layers? Every instruction in a Dockerfile creates a new layer, consisting of the files that changed since the previous layer was created. Once created, layers are immutable.
This buys you two things. First, build caching: when Docker rebuilds an image, it reuses every layer whose instruction and preceding layers are unchanged, so only the modified instruction and everything after it has to be rebuilt. Second, storage sharing: layers that are common to several images exist only once on disk, no matter how many images (or containers) use them.
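For your toolchain use case, this suggests installing each toolchain with its own instruction, so that adding or updating one toolchain only invalidates that layer and the ones after it instead of forcing a full rebuild. A minimal sketch, assuming hypothetical archive names and install paths under /opt:
# Build-environment image (sketch; the toolchain archive names are made up)
FROM ubuntu:latest
RUN apt-get update && apt-get install -y build-essential cmake
# One ADD per toolchain: ADD auto-extracts local tar archives,
# and changing one archive only rebuilds from that line onwards
ADD toolchains/customer-a-gcc-9.tar.gz /opt/customer-a-gcc-9/
ADD toolchains/customer-b-arm-gcc-11.tar.gz /opt/customer-b-arm-gcc-11/
Putting the most stable toolchains first keeps their layers cached for as long as possible; docker history <image> shows what each layer contributes to the total size.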
Before, I said it was "mostly" not true. What's the caveat?
There is one way of decreasing the size of many different images. If you can identify a common dependency which many of them use, it can save space to extract that dependency into a "base image."
Imagine you have two Docker images, defined by the following Dockerfiles:
# Image A
FROM ubuntu:latest
RUN apt-get update && apt-get install -y foo
RUN apt-get update && apt-get install -y gcc
# Image B
FROM ubuntu:latest
RUN apt-get update && apt-get install -y bar
RUN apt-get update && apt-get install -y gcc
Here we have two images, and both of them install gcc. However, the two installations of gcc will create two layers, because Docker can't tell that they're identical. This is a waste of space.
What you can do is create a Dockerfile which defines a base image:
# Base image
FROM ubuntu:latest
RUN apt-get update && apt-get install -y gcc
Then, you run docker build -t my-cool-base-image .
Now, you can reference the base image like this:
# Image A
FROM my-cool-base-image:latest
RUN apt-get update && apt-get install -y foo
# No need to install gcc here; it comes from the base image
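The Dockerfile for image B changes in the same way:
# Image B
FROM my-cool-base-image:latest
RUN apt-get update && apt-get install -y bar
# gcc again comes from the base image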
Now your two images share a single copy of gcc on disk (and so does every container you start from them).
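If you want to verify the sharing, compare the layer digests of the two images; digests that appear in both lists are stored only once on disk. A quick check, assuming the images are tagged image-a and image-b:
docker image inspect --format '{{json .RootFS.Layers}}' image-a
docker image inspect --format '{{json .RootFS.Layers}}' image-b
docker system df also gives an overall picture of how much disk space your images use.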