How to write a genrule to apply a patch


I'm trying to write a genrule in a Bazel BUILD file for one of my packages. The rule should apply a patch to the package source. This is what I have written so far:

genrule(
    name = "patching_rule",
    srcs = ["Package"],
    outs = ["test_output.txt"],
    cmd = "cd $(location Package); patch -p0 < /tmp/mypatch.patch",
)

From the Bazel BUILD documentation I learned that "outs" is a required field. However, my patch does not generate any new file; it only changes 2-3 lines of the Package source code. I can't leave "outs" empty, and I can't add a dummy file there either. Could anyone please help me fix this problem?

Thanks in advance, Nishidha


Solution

  • As said in the comment, if you wish to patch in a genrule, you need to declare the sources to patch as inputs and the resulting patched sources as outputs. Neither genrule nor Bazel in general allows a build action to modify the input tree.
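
    For the in-tree case, such a genrule might look like the sketch below. It assumes a single hypothetical source file original.cc and a patch file mypatch.patch in the same package (the source file name is not from the question), and it relies on the -o option of GNU patch to write the result to a new file instead of editing the input in place:

    genrule(
        name = "patched_source",
        # Both the file to patch and the patch itself are declared inputs.
        srcs = [
            "original.cc",  # hypothetical source file
            "mypatch.patch",
        ],
        # The patched copy is a declared output; the input stays untouched.
        outs = ["patched.cc"],
        cmd = "patch -o $@ $(location original.cc) $(location mypatch.patch)",
    )

    Other rules would then depend on patched.cc rather than on original.cc.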

    However, since this specific case is about patching an external repository (TensorFlow), you can replace whichever repository rule you are using (probably a local_repository) in the WORKSPACE file with a custom implementation (let's name it local_patched_repository), so that part of the WORKSPACE file will look like:

    load("//:local_patched_repository.bzl", "local_patched_repository")
    local_patched_repository(
        name = "org_tensorflow",
        path = "tensorflow",
        patch = "//:mypatch.patch",
    )
    

    This needs a BUILD file (it can be empty), mypatch.patch, and local_patched_repository.bzl next to the WORKSPACE file. The content of local_patched_repository.bzl would look like:

    def _impl(rctxt):
      path = rctxt.attr.path
      # This part is a bit ugly: resolve the actual path if it is relative.
      if path[0] != "/":
        # rctxt.path(Label("//:BUILD")) returns the path to the BUILD file
        # in the current workspace, so its dirname is the workspace root,
        # which the relative path is resolved against.
        path = str(rctxt.path(Label("//:BUILD")).dirname) + "/" + path
      # Copy the repository. The command goes through bash so that the "*"
      # glob is expanded; execute() does not run a shell by itself.
      result = rctxt.execute(
          ["bash", "-c", "cp -fr %s/* %s" % (path, rctxt.path("."))])
      if result.return_code != 0:
        fail("Failed to copy %s (%s)" % (rctxt.attr.path, result.return_code))
      # Now patch the repository
      patch_file = str(rctxt.path(rctxt.attr.patch).realpath)
      result = rctxt.execute(["bash", "-c", "patch -p0 < " + patch_file])
      if result.return_code != 0:
        fail("Failed to patch (%s): %s" % (result.return_code, result.stderr))
    
    local_patched_repository = repository_rule(
        implementation=_impl,
        attrs={
            "path": attr.string(mandatory=True),
            "patch": attr.label(mandatory=True)
        },
        local = True)
    

    Of course this is a quick implementation and there is a catch: local = True makes Bazel re-evaluate this repository often, so if patching is slow you might want to remove it (which means changes to files in the tensorflow directory will no longer be picked up). Even with local = True, targets should normally not be rebuilt unless you actually change a file, unless you hit a Bazel bug.

    You can also replace the cp with rctxt.download_and_extract if you want to replace an http_repository instead (but TensorFlow still requires some modifications made by ./configure in order to work, which makes it incompatible with http_repository).
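
    A rough sketch of that download-based variant is shown here. The rule name http_patched_repository and the attributes url, sha256, and strip_prefix are made up for illustration, and the sketch assumes the archive already ships its own WORKSPACE and BUILD files:

    def _http_patched_impl(rctxt):
      # Fetch and unpack the archive into the repository directory.
      rctxt.download_and_extract(
          url = rctxt.attr.url,
          sha256 = rctxt.attr.sha256,
          stripPrefix = rctxt.attr.strip_prefix)
      # Patch the extracted sources, the same way as in the local version.
      patch_file = str(rctxt.path(rctxt.attr.patch).realpath)
      result = rctxt.execute(["bash", "-c", "patch -p0 < " + patch_file])
      if result.return_code != 0:
        fail("Failed to patch (%s): %s" % (result.return_code, result.stderr))

    # Hypothetical rule; no local = True, since the archive does not change.
    http_patched_repository = repository_rule(
        implementation=_http_patched_impl,
        attrs={
            "url": attr.string(mandatory=True),
            "sha256": attr.string(),
            "strip_prefix": attr.string(),
            "patch": attr.label(mandatory=True)
        })

    It would be loaded and called from the WORKSPACE file in the same way as local_patched_repository, with a url instead of a path.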

    EDIT: A patch to patch on the fly the eigen http_repository on TensorFlow