I'm hoping to get my pip install instructions inside my docker builds as fast as possible.
I've read many posts explaining how adding your requirements.txt before the rest of the app helps you take advantage of Docker's own image cache if your requirements.txt hasn't changed. But this is no help at all when dependencies do change, even slightly.
The next step would be to use a consistent pip cache directory. By default, pip caches downloaded packages in ~/.cache/pip (on Linux), so if you're installing a version of a module that has already been installed anywhere on the system, it shouldn't need to download it again and can simply use the cached version. If we could leverage a shared cache directory for docker builds, this could speed up dependency installs a lot.
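As a rough sketch of where that cache ends up on Linux: the helper below approximates pip's lookup order (PIP_CACHE_DIR first, then XDG_CACHE_HOME, then ~/.cache). The function name pip_cache_dir is made up for illustration, and on pip 20.1 or newer the pip cache dir command is the authoritative answer.

```shell
# Approximation of pip's cache directory resolution on Linux (an assumption,
# not pip's own code): PIP_CACHE_DIR wins, then $XDG_CACHE_HOME/pip,
# then ~/.cache/pip.
pip_cache_dir() {
  printf '%s\n' "${PIP_CACHE_DIR:-${XDG_CACHE_HOME:-$HOME/.cache}/pip}"
}

# Print the directory for the current user.
pip_cache_dir
```

For root inside a typical Python image with neither variable set, this resolves to /root/.cache/pip.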
However, there doesn't appear to be any simple way to mount a volume while running docker build. The build environment seems to be basically impenetrable. I found one article suggesting a genius but complex method of running an rsync server on the host and then, with a hack inside the build to get the host IP, rsyncing the pip cache in from the host. But I'm not relishing the idea of running an rsync server in Jenkins (which isn't the most secure platform at the best of times).
Does anyone know if there's any other way to achieve a shared cache volume more simply?
I suggest using BuildKit, which can mount a persistent cache directory during the build.
Dockerfile:
# syntax=docker/dockerfile:1
FROM python:3.6-alpine
RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml
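NOTE: the # syntax line is a must; you have to put it at the very beginning of the Dockerfile to enable this feature. On older Docker versions, where cache mounts were still experimental, use # syntax = docker/dockerfile:experimental instead.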
1. The first build:
export DOCKER_BUILDKIT=1
docker build --progress=plain -t abc:1 . --no-cache
The first log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:11:35.296107357 +0000 UTC
#9 1.955 Collecting pyyaml
#9 3.050 Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 5.006 Building wheels for collected packages: pyyaml
#9 5.007 Building wheel for pyyaml (setup.py): started
#9 5.249 Building wheel for pyyaml (setup.py): finished with status 'done'
#9 5.250 Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=867daf35eab43c2d047ad737ea1e9eaeb4168b87501cd4d62c533f671208acaa
#9 5.250 Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 5.267 Successfully built pyyaml
#9 5.274 Installing collected packages: pyyaml
#9 5.309 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:11:42.221146294 +0000 UTC
#9 duration: 6.925038937s
From the log above, you can see that the first build downloads pyyaml from the internet.
2. The second build:
docker build --progress=plain -t abc:1 . --no-cache
The second log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:16:58.588157354 +0000 UTC
#9 1.786 Collecting pyyaml
#9 2.234 Installing collected packages: pyyaml
#9 2.270 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:17:01.933398002 +0000 UTC
#9 duration: 3.345240648s
From the log above, you can see that the build no longer downloads the package from the internet; it just uses the cache. NOTE: this is not the traditional docker build cache, since I used --no-cache; it's the /root/.cache/pip directory that I mounted into the build.
3. The third build, after deleting the BuildKit cache:
docker builder prune
docker build --progress=plain -t abc:1 . --no-cache
The third log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:19:07.434792944 +0000 UTC
#9 1.894 Collecting pyyaml
#9 2.740 Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 3.319 Building wheels for collected packages: pyyaml
#9 3.319 Building wheel for pyyaml (setup.py): started
#9 3.560 Building wheel for pyyaml (setup.py): finished with status 'done'
#9 3.560 Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=cea5bc4689e231df7915c2fc3abca225d4ee2e869a7540682aacb6d42eb17053
#9 3.560 Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 3.580 Successfully built pyyaml
#9 3.585 Installing collected packages: pyyaml
#9 3.622 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:19:12.530742712 +0000 UTC
#9 duration: 5.095949768s
From the log above, you can see that once the BuildKit cache is deleted, the package is downloaded again.
In short, this gives you a cache that is shared across builds and is only mounted while the image is being built. The image itself does not contain the cache, which also avoids a lot of intermediate layers in the image.
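Applied to the requirements.txt layout from the question, the cache mount can be combined with the usual layer-cache ordering. This is only a sketch, written as a shell heredoc so it can be pasted as-is; the file name Dockerfile.pipcache and the /app working directory are placeholders I made up, not part of the original setup.

```shell
# Write a sketch Dockerfile that uses both caches:
#  - the COPY of requirements.txt first keeps Docker's layer cache valid
#    while the dependency list is unchanged;
#  - the cache mount keeps pip's download/wheel cache between builds even
#    when the dependency list does change.
cat > Dockerfile.pipcache <<'EOF'
# syntax=docker/dockerfile:1
FROM python:3.6-alpine
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . .
EOF
```

It can then be built with DOCKER_BUILDKIT=1 docker build -f Dockerfile.pipcache -t abc:1 . from a directory that contains a requirements.txt.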
EDIT for folks who are using docker compose and are too lazy to read the comments:
You can also do this with docker-compose if you set COMPOSE_DOCKER_CLI_BUILD=1. For example: COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build
UPDATE in response to a question, 2020/09/02:
Starting from some version I can't pin down (I'm on 19.03.11 now), if you don't specify mode for the cache directory, the cache is not reused by the next build.
I don't know the exact reason, but you can add mode=0755 to the --mount options in the Dockerfile to make it work again:
Dockerfile:
# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
RUN --mount=type=cache,mode=0755,target=/root/.cache/pip pip install pyyaml
UPDATE in response to a question, 2023/04/23:
Q: Where is the cache exactly on the host?
A: The cache on the host is maintained by docker in an overlay. Run docker buildx du --verbose and look for an entry with Type: exec.cachemount; its ID (here ntpjzcz8hhx31b80nwxji05hn) identifies the cache:
ID: ntpjzcz8hhx31b80nwxji05hn
Created at: 2023-04-23 01:36:41.102680066 +0000 UTC
Mutable: true
Reclaimable: true
Shared: false
Size: 3.601MB
Description: cached mount /root/.cache/pip from exec /bin/sh -c pip install pyyaml
Usage count: 2
Last used: 7 minutes ago
Type: exec.cachemount
Then go to /var/lib/docker/overlay2/ntpjzcz8hhx31b80nwxji05hn/diff/cache/wheels to find the cached pyyaml wheel (the path depends on the ID you got above). On my machine, it looks like this:
root@shdebian1:/var/lib/docker/overlay2/ntpjzcz8hhx31b80nwxji05hn/diff/cache/wheels/81/5a/02/b3447894318b70e3cbff3cb4f1a50d9d50a848185358de1d71# ls
PyYAML-6.0-cp36-cp36m-linux_x86_64.whl
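If there are many entries, picking out the cache mount IDs by eye gets tedious. Here is a small sketch that filters them with awk, assuming the verbose output keeps the "ID:" line before the "Type:" line of each record, as in the listing above. The function name list_cache_mount_ids is made up for illustration, and other buildx versions may print a different layout.

```shell
# Print the ID of every record whose Type is exec.cachemount.
# Assumes each record lists "ID: <id>" before "Type: <type>".
list_cache_mount_ids() {
  awk '
    $1 == "ID:"                              { id = $2 }   # remember latest ID
    $1 == "Type:" && $2 == "exec.cachemount" { print id }  # emit it for cache mounts
  '
}

# Usage (needs Docker with buildx, so it is left commented out here):
# docker buildx du --verbose | list_cache_mount_ids
```

Each printed ID can then be substituted into the /var/lib/docker/overlay2/<ID>/diff path shown above.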