
How can I inspect the file system of a failed `docker build`?


I'm trying to build a new Docker image for our development process, using cpanm to install a bunch of Perl modules as a base image for various projects.

While developing the Dockerfile, cpanm returns a failure code because some of the modules did not install cleanly.

I'm fairly sure I need to get apt to install some more things.

Where can I find the /.cpanm/work directory quoted in the output, in order to inspect the logs? In the general case, how can I inspect the file system of a failed docker build command?

After running a find I discovered

/var/lib/docker/aufs/diff/3afa404e[...]/.cpanm

Is this reliable, or am I better off building a "bare" container and running stuff manually until I have all the things I need?


Solution

  • Every time Docker successfully executes a RUN command from a Dockerfile, it commits a new layer in the image filesystem. Conveniently, you can use those layer IDs as images to start a new container.

    Take the following Dockerfile:

    FROM busybox
    RUN echo 'foo' > /tmp/foo.txt
    RUN echo 'bar' >> /tmp/foo.txt
    

    Then build it with DOCKER_BUILDKIT=0, which selects the legacy builder and makes the ID of each image layer visible in the output:

    $ DOCKER_BUILDKIT=0 docker build -t so-26220957 .
    Sending build context to Docker daemon 47.62 kB
    Step 1/3 : FROM busybox
     ---> 00f017a8c2a6
    Step 2/3 : RUN echo 'foo' > /tmp/foo.txt
     ---> Running in 4dbd01ebf27f
     ---> 044e1532c690
    Removing intermediate container 4dbd01ebf27f
    Step 3/3 : RUN echo 'bar' >> /tmp/foo.txt
     ---> Running in 74d81cb9d2b1
     ---> 5bd8172529c1
    Removing intermediate container 74d81cb9d2b1
    Successfully built 5bd8172529c1
    

    You can now start a new container from any of these layer IDs (00f017a8c2a6, 044e1532c690, or 5bd8172529c1):

    $ docker run --rm 00f017a8c2a6 cat /tmp/foo.txt
    cat: /tmp/foo.txt: No such file or directory
    
    $ docker run --rm 044e1532c690 cat /tmp/foo.txt
    foo
    
    $ docker run --rm 5bd8172529c1 cat /tmp/foo.txt
    foo
    bar
    

    Of course, you might want to start a shell to explore the filesystem and try out commands:

    $ docker run --rm -it 044e1532c690 sh      
    / # ls -l /tmp
    total 4
    -rw-r--r--    1 root     root             4 Mar  9 19:09 foo.txt
    / # cat /tmp/foo.txt 
    foo
    

    When one of the Dockerfile commands fails, look for the ID of the preceding layer (the last one that was committed successfully) and run a shell in a container created from that ID, substituting sh if the image does not ship bash:

    docker run --rm -it <id_last_working_layer> bash -il
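
Finding that ID by eye gets tedious with long build logs. The following is a minimal sketch, assuming the log comes from the legacy builder (DOCKER_BUILDKIT=0) and has the same shape as the output shown earlier. The function name `last_layer_id` is hypothetical, and the sample log is made up for illustration, with the third step replaced by a command that fails.

```shell
# Hypothetical helper (not part of Docker): print the ID of the last image
# layer committed in a legacy-builder build log read from standard input.
last_layer_id() {
    # A committed layer is logged as " ---> <hex id>". Lines such as
    # " ---> Running in <id>" or " ---> Using cache" have a different shape
    # and are skipped.
    awk '$1 == "--->" && NF == 2 && $2 ~ /^[0-9a-f]+$/ { id = $2 }
         END { if (id != "") print id }'
}

# Made-up sample log modelled on the build above, assuming step 3 fails.
last_layer_id <<'EOF'
Step 1/3 : FROM busybox
 ---> 00f017a8c2a6
Step 2/3 : RUN echo 'foo' > /tmp/foo.txt
 ---> Running in 4dbd01ebf27f
 ---> 044e1532c690
Removing intermediate container 4dbd01ebf27f
Step 3/3 : RUN false
 ---> Running in 74d81cb9d2b1
The command '/bin/sh -c false' returned a non-zero code: 1
EOF
# prints 044e1532c690
```

With a real build, the log could be saved with `DOCKER_BUILDKIT=0 docker build . 2>&1 | tee build.log`, and the result used as `docker run --rm -it "$(last_layer_id < build.log)" sh`.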
    

    Once in the container:

    • try the command that failed, and reproduce the issue
    • then fix the command and test it
    • finally update your Dockerfile with the fixed command
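
For the cpanm case in the question, reproducing the failure inside the container recreates the log that the error message points at. The sketch below locates the newest log. It assumes cpanm's usual layout, where each run writes to `work/<epoch>.<pid>/build.log` under the cpanm home directory (typically `$HOME/.cpanm`). The function name `latest_cpanm_log` is hypothetical.

```shell
# Hypothetical helper: print the path of the newest cpanm build log.
# The first argument is the cpanm home directory (assumed default: $HOME/.cpanm).
latest_cpanm_log() {
    cpanm_home="${1:-$HOME/.cpanm}"
    latest=""
    # Work directories are assumed to be named <epoch>.<pid>, so the shell's
    # sorted glob expansion visits the most recent run last.
    for log in "$cpanm_home"/work/*/build.log; do
        [ -f "$log" ] && latest="$log"
    done
    # Print nothing and return non-zero when no log was found.
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Example use inside the debugging container, after re-running the failed
# cpanm command:
#   tail -n 50 "$(latest_cpanm_log)"
```

The end of that log usually shows which header or library was missing, which indicates the package apt still needs to install.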

    If you really need to experiment in the actual layer that failed instead of working from the last working layer, see Drew's answer.


    UPDATE
    Intermediate layer hashes are no longer printed as of Docker version 20.10 unless the legacy builder is selected with DOCKER_BUILDKIT=0, as in the build command above. See Jannis Schönleber's answer.