Tags: python, docker, dockerfile, python-packaging

How to speed up cloning a Git repo into a Docker container?


I have some code from an external Python repository that I use in a Dockerfile.

RUN git clone ssh://[email protected]/sample_repo.git /sample_repo

How can I make all of this code (A) accessible in the Docker container, (B) available much faster than with git clone, and (C) up to date with recent code changes in the repository?

Before I go down the path of creating a private Python package repository, I want to be sure I implement a solution that plays well with Docker and factors in all of the above.


Solution

  • If you want an existing container to fetch recent code changes, there isn't really a way around running git clone for the container, so that you can later run git pull inside it.

    If you don't need the entire history, then perhaps git clone --depth 1 would speed up the initial clone.

    RUN git clone --depth 1 ssh://[email protected]/sample_repo.git /sample_repo
    

    With --depth 1, the clone command copies only the latest revision of everything in the repository instead of the full history (and, by default, only a single branch). This can be a lifesaver for Git servers that might otherwise be overwhelmed by CI/CD (Continuous Integration / Continuous Delivery) automation.
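To see what a shallow clone actually contains, the following sketch may help. The function names shallow_clone and count_commits are made up for illustration and are not part of git; only git clone --depth and git rev-list --count are real commands.

```shell
#!/bin/sh
# Sketch: make a shallow clone and check how much history it holds.
# Both function names are hypothetical helpers, not git commands.

# shallow_clone SOURCE_URL TARGET_DIR
# Clones only the most recent commit of the default branch.
shallow_clone() {
    git clone --quiet --depth 1 "$1" "$2"
}

# count_commits CLONE_DIR
# Prints the number of commits reachable from HEAD in that clone.
count_commits() {
    git -C "$1" rev-list --count HEAD
}
```

After shallow_clone, count_commits on the new clone prints 1 regardless of how long the upstream history is, and a later git pull inside the clone still fetches new commits. When trying this against a repository on the same machine, use a file:// URL, because git ignores --depth for plain local paths.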

    If you don't want git in the container at all, and are comfortable rebuilding the image to pick up code changes, then a helper script that exports the code to the host machine with git archive, followed by an ADD statement in the Dockerfile, would work too.
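The git archive approach could look roughly like the sketch below. It assumes a clone of the repository already exists on the host. The function name export_snapshot, the clone path, the tarball name, and the image name are all placeholders rather than anything from the original setup.

```shell
#!/bin/sh
# Sketch: export a snapshot of a host-side clone so the Dockerfile can ADD it.
# Assumption: the host already has a clone of the repository.

# export_snapshot CLONE_DIR OUTPUT_TAR [REF]
# Writes the files at REF (default: HEAD) to OUTPUT_TAR under the
# top-level directory sample_repo/, without any .git data.
export_snapshot() {
    clone_dir=$1
    output_tar=$2
    ref=${3:-HEAD}
    git -C "$clone_dir" archive --format=tar --prefix=sample_repo/ "$ref" > "$output_tar"
}

# Possible use before each build (paths and image name are placeholders):
#   git -C ./sample_repo_clone pull --ff-only
#   export_snapshot ./sample_repo_clone ./sample_repo.tar
#   docker build -t sample_image .
#
# with a Dockerfile line such as:
#   ADD sample_repo.tar /
```

Because ADD unpacks a local tar archive automatically, the line above should leave the code at /sample_repo in the image, and git itself never needs to be installed there. The trade-off is that picking up new code means re-running the script and rebuilding the image.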