Tags: python, docker, google-cloud-build

Dockerfile COPY command is missing a single file when using `gcloud builds submit`


I have run into an incredibly frustrating problem where a COPY command in my Dockerfile successfully copies all of my app's files except one. I do not have a .dockerignore file, so I know the file isn't being excluded from the build that way.

Note: I do have a .gitignore which excludes file2.json, a file I do not want to version. But as you will see below, I'm building from my local folder, not remotely from a clone/checkout, so I don't see why .gitignore would influence the docker build in this case.

Below is what my directory looks like:

$ tree -a -I .git app
app
├── app
│   ├── data
│   │   ├── file1.txt
│   │   ├── file2.json
│   │   ├── file3.txt
│   │   └── file4.yml
│   ├── somefile2.py
│   └── somefile.py
├── Dockerfile
├── .gitignore
├── requirements.txt
└── setup.py

And this is what my Dockerfile looks like:

FROM ubuntu:18.04
FROM python:3.7
  
COPY . /app
  
RUN cp app/app/data/file2.json ~/.somenewhiddendirectory
   
RUN pip install app/.
   
ENTRYPOINT ["python", "app/app/somefile.py"]

For some reason, file2.json is not being copied during the COPY . /app call, and I get an error when I try to cp it somewhere else. I have added a call like RUN ls app/app/data/ and all the files except file2.json are in there. I checked the file's permissions and made sure they are the same as those of all the other files. I have also tried a direct COPY of that file, which results in an error because Docker says the file doesn't exist.

On my system, that file exists: I can see it with ls, and I can cat its contents. I have played around with ensuring the build context is squarely the root directory of my app and, like I said, all files are correctly copied except that json file. I can't for the life of me figure out why Docker hates this file.

For some added context, I am using Google Cloud Build to build the image, and the yaml config looks like this:

steps:
  - name: gcr.io/cloud-builders/docker
    id: base-image-build
    waitFor: [-]
    args:
      - build
      - .
      - -t
      - us.gcr.io/${PROJECT_ID}/base/${BRANCH_NAME}:${SHORT_SHA}
        
images:
  - us.gcr.io/${PROJECT_ID}/base/${BRANCH_NAME}:${SHORT_SHA}

and the command I am executing looks like this:

gcloud builds submit --config=cloudbuild.yaml . \
  --substitutions=SHORT_SHA="$(git rev-parse --short HEAD)",BRANCH_NAME="$(git rev-parse --abbrev-ref HEAD)"

Solution

  • Disclaimer: I have never used Google Cloud Build, so my answer is based only on reading the documentation.


    "I don't see why .gitignore would influence the docker build in this case"

    Indeed, docker build by itself does not care about your .gitignore file. But you are building through Google Cloud Build, and that is a totally different story.

    Quoting the documentation for the source argument of the gcloud builds submit command:

    [SOURCE]
    The location of the source to build. The location can be a directory on a local disk or a gzipped archive file (.tar.gz) in Google Cloud Storage. If the source is a local directory, this command skips the files specified in the --ignore-file. If --ignore-file is not specified, use .gcloudignore file. If a .gcloudignore file is absent and a .gitignore file is present in the local source directory, gcloud will use a generated Git-compatible .gcloudignore file that respects your .gitignore files. The global .gitignore is not respected. For more information on .gcloudignore, see gcloud topic gcloudignore
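The decision order described in that quote can be sketched in a few lines of Python. This is only an illustration of the documented precedence, not gcloud's own code: the function name `ignore_rules_source` and the returned labels are hypothetical, and the quoted text does not say what happens when neither file exists, so that branch is a placeholder.

```python
from pathlib import Path
from typing import Optional


def ignore_rules_source(source_dir: str, ignore_file: Optional[str] = None) -> str:
    """Return which ignore rules apply, following the order in the quoted docs.

    Hypothetical helper for illustration; this is not gcloud's implementation.
    """
    root = Path(source_dir)
    if ignore_file is not None:
        # 1. An explicit --ignore-file takes priority.
        return f"--ignore-file: {ignore_file}"
    if (root / ".gcloudignore").is_file():
        # 2. Otherwise a .gcloudignore in the source directory is used.
        return ".gcloudignore"
    if (root / ".gitignore").is_file():
        # 3. Otherwise rules are generated from .gitignore, so files hidden
        #    from git are also left out of the upload.
        return "generated from .gitignore"
    # 4. Not covered by the quoted text; "none" is a placeholder label.
    return "none"
```

With the directory layout from the question (a .gitignore but no .gcloudignore and no --ignore-file), the function lands in the third branch, which matches the behaviour observed: file2.json never reaches the build context.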

    So in your case, the file is ignored even for a build from your local directory, because gcloud leaves it out of the source it uploads. At this point I see two options to work around this problem:

    1. Remove the entry for your file in .gitignore so that the default gcloud mechanism does not ignore it during your build.
    2. Provide an --ignore-file or a default .gcloudignore that does not exclude the local file you keep out of version control.

    I would personally go for the second option, with something super simple like the following .gcloudignore file (crafted from the relevant documentation):

    .git
    .gcloudignore
    .gitignore
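To confirm that the ignore rules now behave as intended before submitting a build, one possibility is to ask gcloud which files it would upload and check that the file is among them. The sketch below assumes the gcloud CLI is installed and that `gcloud meta list-files-for-upload` prints one relative path per line; the helper names `list_upload_files` and `missing_from_upload` are hypothetical.

```python
import subprocess
from typing import Iterable, List


def list_upload_files(source_dir: str = ".") -> List[str]:
    """Ask gcloud which files it would upload from source_dir."""
    result = subprocess.run(
        ["gcloud", "meta", "list-files-for-upload", source_dir],
        check=True,
        capture_output=True,
        text=True,
    )
    # Assumption: the command prints one relative path per line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def missing_from_upload(required: Iterable[str], listed: Iterable[str]) -> List[str]:
    """Return the required paths that are absent from the upload listing."""
    listed_set = {path.replace("\\", "/") for path in listed}
    return [path for path in required if path not in listed_set]


# Example, run from the project root (requires the gcloud CLI):
# missing = missing_from_upload(["app/data/file2.json"], list_upload_files("."))
# if missing:
#     raise SystemExit(f"Not uploaded, check your ignore files: {missing}")
```

Keeping the comparison in a separate function means it can be exercised with a hand-written listing, without calling gcloud at all.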