Bazel: How do you get the path to a generated file?

In Bazel, given a build target, how would a script (which is running outside of Bazel) get the path to the generated file?

Scenario: I'm using Bazel to do the build, and then when it's done, I want to copy the result to a server. I just need to know what files to copy. I could hard-code the list of files, but I would prefer not to do that.

A simple example. Take this BUILD file:

genrule(
    name = "main",
    srcs = ["main.in"],
    outs = ["main.out"],
    cmd = "cp $< $@",
)

If you then make a file named main.in and then run bazel build :main, bazel reports:

INFO: Found 1 target...
Target //:main up-to-date:
  bazel-genfiles/main.out
INFO: Elapsed time: 6.427s, Critical Path: 0.40s

So there it is: bazel-genfiles/main.out. But what machine-readable technique can I use to get that path? (I could parse the output of bazel build, but we are discouraged from doing that.)

The closest I have found is to use bazel query --output=xml :main, which dumps information about :main in XML format. The output includes this line:

<rule-output name="//:main.out"/>

That is so close to what I want. But the name is in Bazel's label format; I don't see how to get it as a path.

I could do some kind of string replacement on that name field, to turn it into bazel-genfiles/main.out; but even that isn't reliable. If my genrule had included output_to_bindir = 1, then the output would have been bazel-bin/main.out instead.
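To make the fragility concrete, here is a minimal Python sketch of that string replacement. The function names are hypothetical, and the output root has to be passed in by the caller, because nothing in the label says whether the file lands under bazel-genfiles or bazel-bin:

```python
def label_to_relative_path(label):
    """Turn an absolute label like "//pkg:file.out" into "pkg/file.out".

    Only handles labels in the main repository (no "@repo" prefix).
    """
    if not label.startswith("//") or ":" not in label:
        raise ValueError("expected a label like //pkg:name, got %r" % (label,))
    package, _, name = label[2:].partition(":")
    return "%s/%s" % (package, name) if package else name


def guess_output_path(label, output_root):
    """Prefix the relative path with an output root chosen by the caller.

    The root ("bazel-genfiles", "bazel-bin", ...) is a guess: it cannot be
    derived from the label alone, which is why this approach is unreliable.
    """
    return "%s/%s" % (output_root, label_to_relative_path(label))


print(guess_output_path("//:main.out", "bazel-genfiles"))
```

The caller still has to know which root applies to each rule, so this only moves the hard-coding from the file list to the root directory.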

Furthermore, not all rules have a <rule-output> element in the XML output. For example, if my BUILD file has this code to make a C library:

cc_library(
    name = "mylib",
    srcs = glob(["*.c"])
)

The output of bazel query --output=xml :mylib does not contain a <rule-output> or anything else helpful:

<?xml version="1.1" encoding="UTF-8" standalone="no"?>
<query version="2">
  <rule class="cc_library" location="/Users/mikemorearty/src/bazel/test1/BUILD:8:1" name="//:mylib">
    <string name="name" value="mylib"/>
    <list name="srcs">
      <label value="//:foo.c"/>
    </list>
    <rule-input name="//:foo.c"/>
    <rule-input name="//tools/defaults:crosstool"/>
    <rule-input name="@bazel_tools//tools/cpp:stl"/>
  </rule>
</query>

Solution

You can get this information by using bazel aquery to query the action graph.

Here’s a slightly richer example, with two output files from a single genrule:

$ ls
BUILD  main.in  WORKSPACE
$ cat WORKSPACE
$ cat BUILD
genrule(
    name = "main",
    srcs = ["main.in"],
    outs = ["main.o1", "main.o2"],
    cmd = "cp $< $(location main.o1); cp $< $(location main.o2)",
)
$ cat main.in
hello

Use bazel aquery //:main --output=textproto to query the action graph with machine-readable output (the proto is analysis.ActionGraphContainer):

$ bazel aquery //:main --output=textproto >aquery_result 2>/dev/null
$ cat aquery_result
artifacts {
  id: "0"
  exec_path: "main.in"
}
artifacts {
  id: "1"
  exec_path: "external/bazel_tools/tools/genrule/genrule-setup.sh"
}
artifacts {
  id: "2"
  exec_path: "bazel-out/k8-fastbuild/genfiles/main.o1"
}
artifacts {
  id: "3"
  exec_path: "bazel-out/k8-fastbuild/genfiles/main.o2"
}
actions {
  target_id: "0"
  action_key: "dd7fd759bbecce118a399c6ce7b0c4aa"
  mnemonic: "Genrule"
  configuration_id: "0"
  arguments: "/bin/bash"
  arguments: "-c"
  arguments: "source external/bazel_tools/tools/genrule/genrule-setup.sh; cp main.in bazel-out/k8-fastbuild/genfiles/main.o1; cp main.in bazel-out/k8-fastbuild/genfiles/main.o2"
  input_dep_set_ids: "0"
  output_ids: "2"
  output_ids: "3"
}
targets {
  id: "0"
  label: "//:main"
  rule_class_id: "0"
}
dep_set_of_files {
  id: "0"
  direct_artifact_ids: "0"
  direct_artifact_ids: "1"
}
configuration {
  id: "0"
  mnemonic: "k8-fastbuild"
  platform_name: "k8"
}
rule_classes {
  id: "0"
  name: "genrule"
}

The data isn’t exactly all in one place, but note that:

the artifacts with IDs 2 and 3 correspond to our two desired output files, and list the output locations of those artifacts as paths to files on disk relative to your workspace root;
the actions entry with target ID 0 lists artifact IDs 2 and 3 as its outputs; and
the targets entry with ID "0" is associated with the //:main label.
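The three-way join described above can be sketched in plain Python, independent of the protobuf bindings. This is only an illustration: it assumes the aquery result has already been loaded into a dict whose keys mirror the textproto field names shown above, and both SAMPLE_GRAPH and outputs_for_label are hypothetical names, not part of Bazel:

```python
# Hand-written dict mirroring the relevant parts of the textproto above.
SAMPLE_GRAPH = {
    "artifacts": [
        {"id": "0", "exec_path": "main.in"},
        {"id": "1", "exec_path": "external/bazel_tools/tools/genrule/genrule-setup.sh"},
        {"id": "2", "exec_path": "bazel-out/k8-fastbuild/genfiles/main.o1"},
        {"id": "3", "exec_path": "bazel-out/k8-fastbuild/genfiles/main.o2"},
    ],
    "actions": [
        {"target_id": "0", "mnemonic": "Genrule", "output_ids": ["2", "3"]},
    ],
    "targets": [
        {"id": "0", "label": "//:main"},
    ],
}


def outputs_for_label(graph, label):
    """Return exec paths of all artifacts produced by actions of `label`."""
    # Step 1: label -> target id(s).
    target_ids = {t["id"] for t in graph["targets"] if t["label"] == label}
    if not target_ids:
        raise KeyError("label %r not found in action graph" % (label,))
    # Step 2: target id -> output artifact ids, via the actions entries.
    output_ids = {
        artifact_id
        for action in graph["actions"]
        if action["target_id"] in target_ids
        for artifact_id in action["output_ids"]
    }
    # Step 3: artifact id -> exec path, preserving the artifacts order.
    return [a["exec_path"] for a in graph["artifacts"] if a["id"] in output_ids]


for path in outputs_for_label(SAMPLE_GRAPH, "//:main"):
    print(path)
```

Inputs such as main.in and genrule-setup.sh are skipped because they appear only in the dep set of the action, never in its output_ids.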

Given this simple structure, it is straightforward to write a script that lists all output files corresponding to a provided label. I can’t find a way to depend directly on Bazel’s definition of analysis.proto or its language bindings from an external repository, so the following script is meant to be patched into the bazelbuild/bazel repository itself:

tools/list_outputs/list_outputs.py

# Copyright 2019 The Bazel Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
r"""Parse an `aquery` result to list outputs created for a target.

Use this binary in conjunction with `bazel aquery` to determine the
paths on disk to output files of a target.

Example usage: first, query the action graph for the target that you
want to analyze:

    bazel aquery //path/to:target --output=textproto >/tmp/aquery_result

Then, from the Bazel repository:

    bazel run //tools/list_outputs -- \
        --aquery_result /tmp/aquery_result \
        --label //path/to:target \
        ;

This will print a list of zero or more output files emitted by the given
target, like:

    bazel-out/k8-fastbuild/foo.genfile
    bazel-out/k8-fastbuild/bar.genfile

If the provided label does not appear in the output graph, an error will
be raised.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import sys

from absl import app
from absl import flags
from google.protobuf import text_format
from src.main.protobuf import analysis_pb2


flags.DEFINE_string(
    "aquery_result",
    None,
    "Path to file containing result of `bazel aquery ... --output=textproto`.",
)
flags.DEFINE_string(
    "label",
    None,
    "Label whose outputs to print.",
)


def die(message):
  sys.stderr.write("fatal: %s\n" % (message,))
  sys.exit(1)


def main(unused_argv):
  if flags.FLAGS.aquery_result is None:
    raise app.UsageError("Missing `--aquery_result` argument.")
  if flags.FLAGS.label is None:
    raise app.UsageError("Missing `--label` argument.")

  if flags.FLAGS.aquery_result == "-":
    aquery_result = sys.stdin.read()
  else:
    with open(flags.FLAGS.aquery_result) as infile:
      aquery_result = infile.read()
  label = flags.FLAGS.label

  action_graph_container = analysis_pb2.ActionGraphContainer()
  text_format.Merge(aquery_result, action_graph_container)

  matching_targets = [
      t for t in action_graph_container.targets
      if t.label == label
  ]
  if len(matching_targets) != 1:
    die(
        "expected exactly one target with label %r; found: %s"
        % (label, sorted(t.label for t in matching_targets))
    )
  target = matching_targets[0]

  all_artifact_ids = frozenset(
      artifact_id
      for action in action_graph_container.actions
      if action.target_id == target.id
      for artifact_id in action.output_ids
  )
  for artifact in action_graph_container.artifacts:
    if artifact.id in all_artifact_ids:
      print(artifact.exec_path)


if __name__ == "__main__":
  app.run(main)

tools/list_outputs/BUILD

# Copyright 2019 The Bazel Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

package(default_visibility = ["//visibility:public"])

licenses(["notice"])  # Apache 2.0

filegroup(
    name = "srcs",
    srcs = glob(["**"]),
)

py_binary(
    name = "list_outputs",
    srcs = ["list_outputs.py"],
    srcs_version = "PY2AND3",
    deps = [
        "//third_party/py/abseil",
        "//src/main/protobuf:analysis_py_proto",
    ],
)

As a Git patch, for your convenience: https://gist.github.com/wchargin/5e6a43a203d6c95454aae2886c5b54e4

Please note that this code hasn’t been reviewed or verified for correctness; I provide it primarily as an example. If it’s useful to you, then maybe this weekend I can write some tests for it and PR it against Bazel itself.
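For the original scenario (copying the results somewhere after the build), the printed paths can be fed to a small copy step. The following is a rough Python 3 sketch, not part of the patch above: copy_outputs is a hypothetical helper, it assumes the paths resolve relative to the workspace root through the bazel-out symlink, and it flattens everything into one directory, so two outputs with the same file name would overwrite each other:

```python
import os
import shutil


def copy_outputs(exec_paths, workspace_root, dest_dir):
    """Copy each output file into dest_dir and return the new paths."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for exec_path in exec_paths:
        # Paths from the action graph are relative, e.g. "bazel-out/...".
        src = os.path.join(workspace_root, exec_path)
        dst = os.path.join(dest_dir, os.path.basename(exec_path))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A staging directory filled this way can then be uploaded with whatever transfer tool the server expects.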