Search code examples
linuxdockeralpine-linuxbusybox

What is the difference between alpine docker image and busybox docker image?


What is the difference between alpine docker image and busybox docker image ?

When I check their dockfiles, alpine is like this (for Alpine v3.12 - 3.12.7)

FROM scratch
ADD alpine-minirootfs-3.12.7-x86_64.tar.gz /
CMD ["/bin/sh"]

busybox is like this

FROM scratch
ADD busybox.tar.xz /
CMD ["sh"]

But as https://alpinelinux.org/about/ says

Alpine Linux is built around musl libc and busybox.

So what is exactly the difference ?

I am also curious that many docker images, (nodejs/nginx/php just name a few) provide images based on alpine but not on busybox. Why is that ? What is use case for busybox image then ? I need to emphasize that I am not looking for an answer about why A is better than B or vise versa or software recommendation.

I have been experiencing intermittent DNS lookup failure for my alpine docker, as here musl-libc - Alpine's Greatest Weakness and here Does Alpine have known DNS issue within Kubernetes? said. That is one of reasons I asked my question.

PS, https://musl.libc.org/ says "musl is an implementation of the C standard library built on top of the Linux system call API" and https://en.wikipedia.org/wiki/Alpine_Linux mentioned

It previously used uClibc as its C standard library instead of the traditional GNU C Library (glibc) most commonly used. Although it is more lightweight, it does have the significant drawback of being binary incompatible with glibc. Thus, all software must be compiled for use with uClibc to work properly. As of 9 April 2014,[16] Alpine Linux switched to musl, which is partially binary compatible with glibc.


Solution

  • The key difference between these is that older versions of the busybox image statically linked busybox against glibc (current versions dynamically link busybox against glibc due to use of libnss even in static configuration), whereas the alpine image dynamically links against musl libc.

    Going into the weighting factors used to choose between these in detail would be off-topic here (software recommendation requests), but some key points:

    Comparing glibc against musl libc, a few salient points (though there are certainly many other factors as well):

    • glibc is built for performance and portability over size (often adding special-case performance optimizations that take a large amount of code).
    • musl libc is built for correctness and size over performance (it's willing to be somewhat slower to have a smaller code size and to run in less RAM); and it's much more aggressive about having correct error reporting (instead of just exiting immediately) in the face of resource exhaustion.
    • glibc is more widely used, so bugs that manifest against its implementation tend to be caught more quickly. Often, when one is the first person to build a given piece of software against musl, one will encounter bugs (typically in that software, not in musl) or places where the maintainer explicitly chose to use GNU extensions instead of sticking to the libc standard.
    • glibc is licensed under LGPL terms; users need to be able to relink binaries against modified versions of that library, which for statically linked executables can boil down to needing to distribute source; whereas musl is under a MIT license, and usable with fewer restrictions.

    Comparing the advantages of a static build against a dynamic build:

    • If your system image will only have a single binary executable (written in C or otherwise using a libc), a static build is always better, as it discards any parts of your libraries that aren't actually used by that one executable.
    • If your system image is intended to have more binaries added that are written in C, using dynamic linking will keep the overall size down, since it allows those binaries to use the libc that's already there.
    • If your system image is intended to have more binaries added in a language that doesn't use libc (this can be the case for Go and Rust, f/e), then you don't benefit from dynamic linking; you don't need the unused parts of libc there because you won't be using them anyhow.

    Honestly, these two images don't between themselves cover the whole matrix space of possibilities; there are situations where neither of them is optimal. There would be value to having an image with only busybox that statically links against musl libc (if everything you're going to add is in a non-C language), or an image with busybox that dynamically links against glibc (if you're going to add more binaries that need libc and aren't compatible with musl).