I am building Bazel targets on GCP with Cloud Build, using a cloudbuild.yaml and the gcr.io/cloud-builders/bazel builder. To run several build steps in parallel, I set waitFor: ['-'] on each step. With that setting I get the error shown below and the build fails. When I remove waitFor: ['-'], the steps run sequentially and the build succeeds. Is there any Bazel configuration I must change in the Cloud Build Bazel builder? This is the error I get when building in parallel:
Another command holds the client lock:
pid=12
owner=client
cwd=/workspace
Waiting for it to complete...
Another command holds the client lock:
pid=13
owner=client
cwd=/workspace
Waiting for it to complete...
Starting local Bazel server and connecting to it...
My cloudbuild.yaml looks like this:
steps:
  - name: "gcr.io/cloud-builders/bazel"
    id: "Building Bazel components and Uploading the component manifest for ml_cmp_1 "
    entrypoint: "bash"
    args:
      - "-c"
      - |
        cmp_dir=components/train_test_split_1
        cmp_bazel_file="$cmp_dir/BUILD.bazel"
        bazel run --remote_cache=${_BAZEL_CACHE_URL} --google_default_credentials --define=PROJ_ID=${_PROJECT_ID} //$cmp_dir:container_push
    waitFor: ["-"]
  - name: "gcr.io/cloud-builders/bazel"
    id: "Building Bazel components and Uploading the component manifest for ml_cmp_2 "
    entrypoint: "bash"
    args:
      - "-c"
      - |
        cmp_dir=components/train_test_split_2
        cmp_bazel_file="$cmp_dir/BUILD.bazel"
        bazel run --remote_cache=${_BAZEL_CACHE_URL} --google_default_credentials --define=PROJ_ID=${_PROJECT_ID} //$cmp_dir:container_push
    waitFor: ["-"]
timeout: 86399s
logsBucket: gs://some_project_id_cloudbuild/logs
options:
  machineType: 'N1_HIGHCPU_8'
  substitution_option: 'ALLOW_LOOSE'
substitutions:
  _PROJECT_ID: "some_project_id"
  _BAZEL_CACHE_URL: "https://storage.googleapis.com/some_project_id_cloudbuild/bazel-cache"
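If you want to run multiple bazel commands in parallel, each one needs its own --output_base. Commands that share an output base also share one Bazel server and its client lock, which is why your parallel steps end up waiting on each other. Note that --output_base is a startup option, so it goes before the command name (bazel --output_base=... run ...). With a separate output base per command, the builds are independent.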
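As a rough sketch of what that could look like for the two steps in the question: the /tmp/bazel_out_1 and /tmp/bazel_out_2 paths and the shortened step ids are placeholders I made up, not anything the builder requires. Any directory that is unique per step should do.

```yaml
steps:
  - name: "gcr.io/cloud-builders/bazel"
    id: "push ml_cmp_1"
    entrypoint: "bash"
    args:
      - "-c"
      - |
        # Startup options such as --output_base go before "run".
        # /tmp/bazel_out_1 is a hypothetical path, unique to this step.
        bazel --output_base=/tmp/bazel_out_1 run \
          --remote_cache=${_BAZEL_CACHE_URL} --google_default_credentials \
          --define=PROJ_ID=${_PROJECT_ID} \
          //components/train_test_split_1:container_push
    waitFor: ["-"]
  - name: "gcr.io/cloud-builders/bazel"
    id: "push ml_cmp_2"
    entrypoint: "bash"
    args:
      - "-c"
      - |
        # A different (also hypothetical) output base, so this step
        # starts its own Bazel server and takes its own lock.
        bazel --output_base=/tmp/bazel_out_2 run \
          --remote_cache=${_BAZEL_CACHE_URL} --google_default_credentials \
          --define=PROJ_ID=${_PROJECT_ID} \
          //components/train_test_split_2:container_push
    waitFor: ["-"]
```

The trade-off is that each step starts with an empty output base, so anything not served by the remote cache gets fetched or rebuilt once per step.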
If you want the parallel builds to share intermediate outputs on the same build machine, a shared --disk_cache is the simplest approach. A remote cache, like the one you already pass via --remote_cache, shares outputs across build machines as well. If you want parallel builds to fully deduplicate common actions, you'll need to set up remote execution.
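For the disk cache variant, a step might look roughly like the following. This assumes the cache directory sits under /workspace, which Cloud Build shares between steps; the .bazel-disk-cache directory name and the /tmp/bazel_out_1 output base are again made-up placeholders.

```yaml
steps:
  - name: "gcr.io/cloud-builders/bazel"
    id: "push ml_cmp_1"
    entrypoint: "bash"
    args:
      - "-c"
      - |
        # Per-step output base (hypothetical path) plus a disk cache
        # directory that every parallel step points at (also hypothetical).
        bazel --output_base=/tmp/bazel_out_1 run \
          --disk_cache=/workspace/.bazel-disk-cache \
          --define=PROJ_ID=${_PROJECT_ID} \
          //components/train_test_split_1:container_push
    waitFor: ["-"]
```

The other steps would use the same --disk_cache value with their own --output_base. Two steps that need the same action at the same moment may still both run it, which is the gap that only remote execution closes.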