Bazel: How do you get the path to a generated file?


In Bazel, given a build target, how would a script (which is running outside of Bazel) get the path to the generated file?

Scenario: I'm using Bazel to do the build, and then when it's done, I want to copy the result to a server. I just need to know what files to copy. I could hard-code the list of files, but I would prefer not to do that.

A simple example: take this BUILD file:

genrule(
    name = "main",
    srcs = ["main.in"],
    outs = ["main.out"],
    cmd = "cp $< $@",
)

If you then create a file named main.in and run bazel build :main, Bazel reports:

INFO: Found 1 target...
Target //:main up-to-date:
  bazel-genfiles/main.out
INFO: Elapsed time: 6.427s, Critical Path: 0.40s

So there it is: bazel-genfiles/main.out. But what machine-readable technique can I use to get that path? (I could parse the output of bazel build, but we are discouraged from doing that.)

The closest I have found is to use bazel query --output=xml :main, which dumps information about :main in XML format. The output includes this line:

<rule-output name="//:main.out"/>

That is so close to what I want. But the name is in Bazel's label format; I don't see how to get it as a path.

I could do some kind of string replacement on that name field, to turn it into bazel-genfiles/main.out; but even that isn't reliable. If my genrule had included output_to_bindir = 1, then the output would have been bazel-bin/main.out instead.

Furthermore, not all rules have a <rule-output> field in the XML output. For example, if my BUILD file has this code to make a C library:

cc_library(
    name = "mylib",
    srcs = glob(["*.c"])
)

The output of bazel query --output=xml :mylib does not contain a <rule-output> or anything else helpful:

<?xml version="1.1" encoding="UTF-8" standalone="no"?>
<query version="2">
  <rule class="cc_library" location="/Users/mikemorearty/src/bazel/test1/BUILD:8:1" name="//:mylib">
    <string name="name" value="mylib"/>
    <list name="srcs">
      <label value="//:foo.c"/>
    </list>
    <rule-input name="//:foo.c"/>
    <rule-input name="//tools/defaults:crosstool"/>
    <rule-input name="@bazel_tools//tools/cpp:stl"/>
  </rule>
</query>
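
To make the limitation concrete, here is a minimal sketch that pulls <rule-output> labels out of query XML with Python's standard xml.etree.ElementTree. The genrule sample is a hand-written stand-in that contains only the <rule-output> line quoted above; the cc_library sample is the output above with the XML declaration line left out (Python's bundled parser may not accept an XML 1.1 declaration, so real output might need that line stripped first). The function name rule_output_labels is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the genrule's XML. Only the <rule-output>
# line is taken from the real output quoted above; the rest is assumed.
GENRULE_XML = """
<query version="2">
  <rule class="genrule" name="//:main">
    <rule-output name="//:main.out"/>
  </rule>
</query>
"""

# The cc_library output quoted above, minus the XML declaration line
# and the machine-specific location attribute.
CC_LIBRARY_XML = """
<query version="2">
  <rule class="cc_library" name="//:mylib">
    <string name="name" value="mylib"/>
    <list name="srcs">
      <label value="//:foo.c"/>
    </list>
    <rule-input name="//:foo.c"/>
    <rule-input name="//tools/defaults:crosstool"/>
    <rule-input name="@bazel_tools//tools/cpp:stl"/>
  </rule>
</query>
"""


def rule_output_labels(xml_text):
    """Return the label of every <rule-output> element in a query result."""
    root = ET.fromstring(xml_text.strip())
    return [element.get("name") for element in root.iter("rule-output")]


genrule_outputs = rule_output_labels(GENRULE_XML)
cc_library_outputs = rule_output_labels(CC_LIBRARY_XML)
print(genrule_outputs)
print(cc_library_outputs)
```

The genrule yields a label (//:main.out) rather than a path, and the cc_library yields an empty list, which is exactly the gap described above.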

Solution

  • You can get this information by using bazel aquery to query the action graph.

    Here’s a slightly richer example, with two output files from a single genrule:

    $ ls
    BUILD  main.in  WORKSPACE
    $ cat WORKSPACE
    $ cat BUILD
    genrule(
        name = "main",
        srcs = ["main.in"],
        outs = ["main.o1", "main.o2"],
        cmd = "cp $< $(location main.o1); cp $< $(location main.o2)",
    )
    $ cat main.in
    hello
    

    Use bazel aquery //:main --output=textproto to query the action graph with machine-readable output (the proto is analysis.ActionGraphContainer):

    $ bazel aquery //:main --output=textproto >aquery_result 2>/dev/null
    $ cat aquery_result
    artifacts {
      id: "0"
      exec_path: "main.in"
    }
    artifacts {
      id: "1"
      exec_path: "external/bazel_tools/tools/genrule/genrule-setup.sh"
    }
    artifacts {
      id: "2"
      exec_path: "bazel-out/k8-fastbuild/genfiles/main.o1"
    }
    artifacts {
      id: "3"
      exec_path: "bazel-out/k8-fastbuild/genfiles/main.o2"
    }
    actions {
      target_id: "0"
      action_key: "dd7fd759bbecce118a399c6ce7b0c4aa"
      mnemonic: "Genrule"
      configuration_id: "0"
      arguments: "/bin/bash"
      arguments: "-c"
      arguments: "source external/bazel_tools/tools/genrule/genrule-setup.sh; cp main.in bazel-out/k8-fastbuild/genfiles/main.o1; cp main.in bazel-out/k8-fastbuild/genfiles/main.o2"
      input_dep_set_ids: "0"
      output_ids: "2"
      output_ids: "3"
    }
    targets {
      id: "0"
      label: "//:main"
      rule_class_id: "0"
    }
    dep_set_of_files {
      id: "0"
      direct_artifact_ids: "0"
      direct_artifact_ids: "1"
    }
    configuration {
      id: "0"
      mnemonic: "k8-fastbuild"
      platform_name: "k8"
    }
    rule_classes {
      id: "0"
      name: "genrule"
    }
    

    The data isn’t exactly all in one place, but note that:

    • the artifacts entries with IDs 2 and 3 correspond to our two desired output files, and their exec_path fields give the location of each file on disk, relative to the execution root (these paths normally resolve from your workspace root too, through the bazel-out symlink);
    • the actions entry with target ID 0 lists artifact IDs 2 and 3 as its output_ids; and
    • the targets entry with ID "0" is associated with the //:main label.
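
The three observations above amount to a small join: label to target ID, target ID to the output_ids of its actions, and output IDs to exec_path. Here is a minimal sketch of that lookup, using a hand-copied subset of the result above as plain Python dicts. The dict layout and the name output_paths are assumptions made for illustration, not something Bazel emits or provides.

```python
# Hand-copied subset of the aquery result above, as plain Python dicts.
# This layout is an assumption for illustration; Bazel does not emit it.
ACTION_GRAPH = {
    "artifacts": [
        {"id": "0", "exec_path": "main.in"},
        {"id": "1", "exec_path": "external/bazel_tools/tools/genrule/genrule-setup.sh"},
        {"id": "2", "exec_path": "bazel-out/k8-fastbuild/genfiles/main.o1"},
        {"id": "3", "exec_path": "bazel-out/k8-fastbuild/genfiles/main.o2"},
    ],
    "actions": [
        {"target_id": "0", "mnemonic": "Genrule", "output_ids": ["2", "3"]},
    ],
    "targets": [
        {"id": "0", "label": "//:main"},
    ],
}


def output_paths(graph, label):
    """Follow label -> target id -> action output ids -> artifact exec paths."""
    target_ids = {t["id"] for t in graph["targets"] if t["label"] == label}
    if not target_ids:
        raise KeyError("no target with label %r" % (label,))
    output_ids = {
        output_id
        for action in graph["actions"]
        if action["target_id"] in target_ids
        for output_id in action["output_ids"]
    }
    # Keep the artifacts in the order the graph lists them.
    return [a["exec_path"] for a in graph["artifacts"] if a["id"] in output_ids]


paths = output_paths(ACTION_GRAPH, "//:main")
print("\n".join(paths))
```

For //:main this prints the two genfiles paths, main.o1 and main.o2, and skips the input artifacts with IDs 0 and 1.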

    Given this simple structure, we can easily whip together a script to list all output files corresponding to a provided label. I can’t find a way to depend directly on Bazel’s definition of analysis.proto or its language bindings from an external repository, so you can patch the following script into the bazelbuild/bazel repository itself:

    tools/list_outputs/list_outputs.py

    # Copyright 2019 The Bazel Authors. All rights reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    r"""Parse an `aquery` result to list outputs created for a target.
    
    Use this binary in conjunction with `bazel aquery` to determine the
    paths on disk to output files of a target.
    
    Example usage: first, query the action graph for the target that you
    want to analyze:
    
        bazel aquery //path/to:target --output=textproto >/tmp/aquery_result
    
    Then, from the Bazel repository:
    
        bazel run //tools/list_outputs -- \
            --aquery_result /tmp/aquery_result \
            --label //path/to:target \
            ;
    
    This will print a list of zero or more output files emitted by the given
    target, like:
    
        bazel-out/k8-fastbuild/foo.genfile
        bazel-out/k8-fastbuild/bar.genfile
    
    If the provided label does not appear in the output graph, an error will
    be raised.
    """
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function
    
    import sys
    
    from absl import app
    from absl import flags
    from google.protobuf import text_format
    from src.main.protobuf import analysis_pb2
    
    
    flags.DEFINE_string(
        "aquery_result",
        None,
        "Path to file containing result of `bazel aquery ... --output=textproto`.",
    )
    flags.DEFINE_string(
        "label",
        None,
        "Label whose outputs to print.",
    )
    
    
    def die(message):
      sys.stderr.write("fatal: %s\n" % (message,))
      sys.exit(1)
    
    
    def main(unused_argv):
      if flags.FLAGS.aquery_result is None:
        raise app.UsageError("Missing `--aquery_result` argument.")
      if flags.FLAGS.label is None:
        raise app.UsageError("Missing `--label` argument.")
    
      if flags.FLAGS.aquery_result == "-":
        aquery_result = sys.stdin.read()
      else:
        with open(flags.FLAGS.aquery_result) as infile:
          aquery_result = infile.read()
      label = flags.FLAGS.label
    
      action_graph_container = analysis_pb2.ActionGraphContainer()
      text_format.Merge(aquery_result, action_graph_container)
    
      matching_targets = [
          t for t in action_graph_container.targets
          if t.label == label
      ]
      if len(matching_targets) != 1:
        die(
            "expected exactly one target with label %r; found: %s"
            % (label, sorted(t.label for t in matching_targets))
        )
      target = matching_targets[0]
    
      all_artifact_ids = frozenset(
          artifact_id
          for action in action_graph_container.actions
          if action.target_id == target.id
          for artifact_id in action.output_ids
      )
      for artifact in action_graph_container.artifacts:
        if artifact.id in all_artifact_ids:
          print(artifact.exec_path)
    
    
    if __name__ == "__main__":
      app.run(main)
    

    tools/list_outputs/BUILD

    # Copyright 2019 The Bazel Authors. All rights reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    package(default_visibility = ["//visibility:public"])
    
    licenses(["notice"])  # Apache 2.0
    
    filegroup(
        name = "srcs",
        srcs = glob(["**"]),
    )
    
    py_binary(
        name = "list_outputs",
        srcs = ["list_outputs.py"],
        srcs_version = "PY2AND3",
        deps = [
            "//third_party/py/abseil",
            "//src/main/protobuf:analysis_py_proto",
        ],
    )
    

    As a Git patch, for your convenience: https://gist.github.com/wchargin/5e6a43a203d6c95454aae2886c5b54e4

    Please note that this code hasn’t been reviewed or verified for correctness; I provide it primarily as an example. If it’s useful to you, then maybe this weekend I can write some tests for it and PR it against Bazel itself.
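
If patching the Bazel repository is not an option, one possible workaround (not the approach taken by the script above) is to read the textproto by hand. The following is a rough sketch that only understands the flat shape shown in the sample output earlier: top-level name { ... } blocks whose fields are all quoted strings. It rejects nested messages, and it decodes strings with Python's ast.literal_eval, which only approximates textproto escaping. parse_flat_textproto is a hypothetical helper, not part of Bazel or protobuf.

```python
import ast

# A trimmed copy of the sample aquery output shown earlier.
SAMPLE = '''\
artifacts {
  id: "2"
  exec_path: "bazel-out/k8-fastbuild/genfiles/main.o1"
}
actions {
  target_id: "0"
  output_ids: "2"
  output_ids: "3"
}
targets {
  id: "0"
  label: "//:main"
}
'''


def parse_flat_textproto(text):
    """Parse one-level `name { key: "value" }` blocks into (name, fields) pairs.

    Each fields dict maps a key to the list of its values, because keys such
    as output_ids repeat. Nested messages and unquoted scalars are rejected.
    """
    blocks = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.endswith("{"):
            if current is not None:
                raise ValueError("nested message not supported: %r" % (line,))
            current = (line[:-1].strip(), {})
        elif line == "}":
            if current is None:
                raise ValueError("unbalanced closing brace")
            blocks.append(current)
            current = None
        else:
            key, separator, value = line.partition(": ")
            if current is None or not separator or not value.startswith('"'):
                raise ValueError("unsupported line: %r" % (line,))
            # Decode the quoted string; this approximates textproto escapes.
            current[1].setdefault(key, []).append(ast.literal_eval(value))
    if current is not None:
        raise ValueError("unterminated block: %r" % (current[0],))
    return blocks


blocks = parse_flat_textproto(SAMPLE)
for name, fields in blocks:
    print(name, fields)
```

Real aquery output may contain nested messages or other field types, depending on the Bazel version and the rules involved, so treat this as a starting point rather than a replacement for the generated analysis_pb2 bindings.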