
How do I make yarn cache modules when building containers?


This is my Dockerfile for local development:

FROM node:12-alpine

WORKDIR /usr/app

ENV __DEV__ 1

COPY package.json ./
COPY yarn.lock ./
RUN yarn --frozen-lockfile

COPY tsconfig.json ./
COPY nodemon.json ./

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]

CMD [ "yarn", "dev" ]

This is how I build it:

docker build --rm -f Dockerfile.dev --tag my-app .

This is how I run it:

docker run --rm -it --volume $(pwd)/src:/usr/app/src -p 3000:3000 my-app

I only need to rebuild it when something outside the src folder changes, for instance when I install node modules. How do I make yarn cache modules somewhere, so that it does not pull every module on each build?


Solution

  • BuildKit is the next generation of container building in Docker. I recommend using it, especially since it has an elegant solution for caching issues. There really isn't a good solution for this in vanilla Docker at the moment; you can work around it, but the workaround is very cumbersome.

    I'll list both solutions here:

    With BuildKit

    Tarun's answer is on the right track, but there's a cleaner way of doing it. BuildKit supports mounting a directory as a cache for the duration of a RUN step. Once you've set up Docker to use BuildKit, all we need to do is:

    ...
    RUN --mount=type=cache,target=/root/.yarn YARN_CACHE_FOLDER=/root/.yarn yarn install
    ...
    

    This will automatically pull in the previous run's cache or create it if it doesn't exist yet or has expired. It's that simple.
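
    For reference, BuildKit can be switched on for a single build with the DOCKER_BUILDKIT environment variable (Docker Engine 23.0 and later use BuildKit by default). A minimal sketch that reuses the file and tag names from the question; the wrapper name build_dev_image is hypothetical and only there for convenience:

    ```shell
    # Hypothetical helper: build the dev image with BuildKit enabled.
    build_dev_image() {
        # DOCKER_BUILDKIT=1 turns BuildKit on for this one command only.
        DOCKER_BUILDKIT=1 docker build -f Dockerfile.dev --tag my-app .
    }
    ```

    Run build_dev_image from the project root. If editing yarn.lock and rebuilding no longer re-downloads every package, the cache mount is being picked up.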

    Vanilla Docker

    Alternatively, you can use vanilla Docker if using BuildKit isn't an option. The best thing we can do here is use a COPY directive to copy in some sort of "cache" located in the build context. For example, if we create a directory .yarn_cache in the root of the build context, then we can provide a cache with:

    ...
    COPY .yarn_cache /root/.yarn
    RUN YARN_CACHE_FOLDER=/root/.yarn yarn --frozen-lockfile
    ...
    

    This external cache will not be updated when your image is built, and it will need to be initialized and periodically updated outside of your image. You can do this with the following shell command (clear any local node_modules on the first run to force it to warm the cache):

    $ YARN_CACHE_FOLDER=.yarn_cache yarn install
    

    While this works, it's very hacky and comes with some downsides:

    • You need to manually create and update the cache.
    • The entire .yarn_cache directory has to be sent as part of the build context, which can be very slow, and this happens on every build, even when nothing has changed.

    For these reasons, the former solution is preferred.
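
    If you do go the vanilla route, the manual cache refresh can at least be scripted. The following is a rough sketch, not a tested workflow: the function name refresh_yarn_cache and the stamp file .yarn_cache/.lock-cksum are hypothetical, and it assumes it is run from the directory that contains yarn.lock. It re-runs the warm-up command only when the lockfile's checksum has changed:

    ```shell
    # Hypothetical helper: re-warm .yarn_cache only when yarn.lock has changed.
    refresh_yarn_cache() {
        stamp=.yarn_cache/.lock-cksum     # hypothetical stamp file
        current=$(cksum < yarn.lock)      # POSIX checksum of the lockfile
        mkdir -p .yarn_cache
        if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
            echo "yarn cache is up to date"
            return 0
        fi
        # Same warm-up command as above; --frozen-lockfile keeps yarn.lock untouched.
        YARN_CACHE_FOLDER=.yarn_cache yarn install --frozen-lockfile || return 1
        # Record the checksum only after a successful install.
        echo "$current" > "$stamp"
    }
    ```

    Calling refresh_yarn_cache before docker build keeps the cache in step with yarn.lock. As noted above, the very first run may still need local node_modules cleared so that yarn actually populates the cache.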


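    Bonus Pro Tip: Copying the yarn cache into the image, as in the vanilla Docker approach, leaves it in the final image and increases its size; a BuildKit cache mount, by contrast, is not written into the image layers. Either way, a multistage build keeps anything left behind by the install step out of the final image: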

    # syntax = docker/dockerfile:1.2
    FROM node:12-alpine as BUILDER
    
    WORKDIR /usr/app
    
    COPY package.json ./
    COPY yarn.lock ./
    RUN --mount=type=cache,target=/root/.yarn YARN_CACHE_FOLDER=/root/.yarn yarn --frozen-lockfile
    
    
    FROM node:12-alpine
    
    WORKDIR /usr/app
    
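    # Source paths in COPY --from are resolved from the root of the source stage, not its WORKDIR
    COPY --from=BUILDER /usr/app/node_modules ./node_modules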
    
    
    COPY package.json ./
    COPY yarn.lock ./
    COPY tsconfig.json ./
    COPY nodemon.json ./
    
    RUN apk add --no-cache tini
    ENTRYPOINT [ "/sbin/tini", "--" ]
    
    ENV __DEV__=1
    
    CMD [ "yarn", "dev" ]