docker, cross-compiling, toolchain

Using docker with many cross build toolchains


In my team, we currently use a single docker image for cross-building a software library; several cross-build toolchains are added to that one image (/opt/...). The Dockerfile and the toolchains are all in one git repository (toolchain archives via git-lfs).

The Dockerfile is convenient because it provides a comprehensive, formal description of the build environment that can be used on different machines.

Now we keep getting toolchains from customers, and sometimes new versions of those toolchains. As a result, our docker image is growing fast and is currently about 40 GB. What's even worse is the time it takes to rebuild the docker image whenever we need to add, modify, or remove a toolchain.

So I've started to doubt that we are doing things in a "good practice" way.

Using a separate docker image for each toolchain would save the time it takes to rebuild the huge single docker image each time. But it's even worse when it comes to space (because each docker image would contain its own Linux base system with build tools, etc.).

I'd be happy about any hints on how to do this in a better way.

  • Are there any "established" ways to handle such a scenario?
  • How do other teams handle this use-case?

Solution

  • Layers

    Using a separate docker image for each toolchain would save the time it takes to rebuild the huge single docker image each time. But it's even worse when it comes to space (because each docker image would contain its own Linux base system with build tools, etc.).

    This is mostly not true, due to a Docker feature called layers.

    What are layers? Each instruction in a Dockerfile creates a new layer, consisting of the files that changed since the previous layer was created. Once created, layers are immutable.

    This buys you two things.

    • The first thing it gets you is build caching: if you change the last command in a Dockerfile and rebuild, only that last command needs to be re-run; everything else comes from the cache.
    • The second thing it accomplishes is that if multiple docker images have the same layers, they share those layers on disk. The first time you use an image based upon Ubuntu, it will cost you a few gigabytes; the second time, it will cost nothing. (A small sketch of both effects follows this list.)
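
    As a quick illustration of both effects, here is a sketch (the toolchain archive name and package choice are made up) in which every instruction becomes one layer:

    # Each instruction below produces one immutable layer
    FROM ubuntu:latest
    RUN apt-get update && apt-get install -y build-essential
    COPY toolchain-aarch64.tar.gz /opt/
    RUN tar -xzf /opt/toolchain-aarch64.tar.gz -C /opt/
    

    Build it and list the layers with their sizes:

    docker build -t toolchain-demo .
    docker history toolchain-demo
    

    If you then change only the last RUN line and rebuild, the earlier layers are taken from the build cache, and any other image that starts FROM ubuntu:latest reuses the Ubuntu layers already on disk.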

    Building a custom base image

    Before, I said it was "mostly" not true. What's the caveat?

    There is one way of decreasing the size of many different images. If you can identify a common dependency which many of them use, it can save space to extract that dependency into a "base image."

    Imagine you have two Docker images, defined by the following Dockerfiles:

    # Image A
    FROM ubuntu:latest
    RUN apt-get update && apt-get install -y foo
    RUN apt-get install -y gcc
    
    # Image B
    FROM ubuntu:latest
    RUN apt-get update && apt-get install -y bar
    RUN apt-get install -y gcc
    

    Here we have two images, and both of them install gcc. However, the two installations of gcc will create two layers, because Docker can't tell that they're identical. This is a waste of space.

    What you can do is create a Dockerfile which defines a base image:

    # Base image
    FROM ubuntu:latest
    RUN apt-get update && apt-get install -y gcc
    

    Then you build and tag it:

    docker build -t my-cool-base-image .
    

    Now, you can reference the base image like this:

    # Image A
    FROM my-cool-base-image:latest
    RUN apt-get update && apt-get install -y foo
    # No need to install gcc here
    

    Now your two images will share a single copy of gcc on disk.
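
    For completeness, Image B is rewritten the same way (a sketch, reusing the hypothetical bar package from above):

    # Image B
    FROM my-cool-base-image:latest
    RUN apt-get update && apt-get install -y bar
    # No need to install gcc here either
    

    Applied to your toolchain scenario, the same pattern would mean one base image holding the common Linux build tools, plus one small image per toolchain on top of it, roughly like this (the archive name is made up):

    # Image for one customer toolchain
    FROM my-cool-base-image:latest
    COPY customer-toolchain.tar.gz /opt/
    RUN tar -xzf /opt/customer-toolchain.tar.gz -C /opt/
    

    Adding, updating, or removing a toolchain then only rebuilds that one small image, and all toolchain images share the base layers on disk.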