How can I access the output of a bazel rule from another rule without using a relative path?


I am attempting to use Bazel to compile a dhall program based on dhall-kubernetes to generate a Kubernetes YAML file.

The basic dhall compile without dhall-kubernetes using a simple bazel macro works ok.

I have made an example using dhall's dependency resolution to download dhall-kubernetes - see here. This also works but is very slow (I think because dhall downloads each remote file separately), and introduces a network dependency to the bazel rule execution, which I would prefer to avoid.

My preferred approach is to use Bazel to download an archive release version of dhall-kubernetes, then have the rule access it locally (see here). My solution requires a relative path in Prelude.dhall and package.dhall for the examples/k8s package to reference dhall-kubernetes. While it works, I am concerned that this is subverting the Bazel sandbox by requiring special knowledge of the folder structure used internally by Bazel. Is there a better way?

Prelude.dhall:

../../external/dhall-kubernetes/1.17/Prelude.dhall 

WORKSPACE:

workspace(name = "dhall")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

DHALL_KUBERNETES_VERSION = "4.0.0"

http_archive(
    name = "dhall-kubernetes",
    sha256 = "0bc2b5d2735ca60ae26d388640a4790bd945abf326da52f7f28a66159e56220d",
    url = "https://github.com/dhall-lang/dhall-kubernetes/archive/v%s.zip" % DHALL_KUBERNETES_VERSION,
    strip_prefix = "dhall-kubernetes-%s" % DHALL_KUBERNETES_VERSION,
    build_file = "@//:BUILD.dhall-kubernetes",
)

BUILD.dhall-kubernetes:

package(default_visibility=['//visibility:public'])

filegroup(
    name = "dhall-k8s-1.17",
    srcs = glob([
        "1.17/**/*",
    ]),
)

examples/k8s/BUILD:

package(default_visibility = ["//visibility:public"])

genrule(
    name = "special_ingress",
    srcs = [
        "ingress.dhall",
        "Prelude.dhall",
        "package.dhall",
        "@dhall-kubernetes//:dhall-k8s-1.17",
    ],
    outs = ["ingress.yaml"],
    cmd = "dhall-to-yaml --file $(location ingress.dhall) > $@",
    visibility = ["//visibility:public"],
)

Solution

  • There is a way to instrument dhall to do "offline" builds, meaning that the package manager fetches all Dhall dependencies instead of Dhall fetching them.

    In fact, I implemented exactly this for Nixpkgs, which you may be able to translate to Bazel:

    High-level explanation

    The basic trick is to take advantage of a feature of Dhall's import system, which is that if a package protected by a semantic integrity check (i.e. a "semantic hash") is cached then Dhall will use the cache instead of fetching the package. You can build upon this trick to have the package manager bypass Dhall's remote imports by injecting dependencies in this way.

    You can find the Nix-related logic for this here:

    ... but I will try to explain how it works in a package-manager-independent way.

    Package structure

    First, the final product of a Dhall "package" built using Nix is a directory with the following structure:

    $ nix-build --attr 'dhallPackages.Prelude'         
    …
    
    $ tree -a ./result
    ./result
    ├── .cache
    │   └── dhall
    │       └── 122026b0ef498663d269e4dc6a82b0ee289ec565d683ef4c00d0ebdd25333a5a3c98
    └── binary.dhall
    
    2 directories, 2 files
    

    The contents of this directory are:

    • ./.cache/dhall/1220XXX…XXX

      A valid cache directory for Dhall containing a single build product: the binary encoding of the interpreted Dhall expression.

      You can create such a binary file using dhall encode, and you can compute the file name by replacing the XXX…XXX above with the SHA-256 hash of the expression, which you can obtain using the dhall hash command.

    • ./binary.dhall

      A convenient Dhall file containing the expression missing sha256:XXX…XXX. Interpreting this expression only succeeds if the expression we built matching the hash sha256:XXX…XXX is already cached.

      The file is called binary.dhall because this is the Dhall equivalent of a "binary" package distribution, meaning that the import can only be obtained from a binary cache and cannot be fetched and interpreted from source.

    • Optional: ./source.dhall

      This is a file containing a fully αβ-normalized expression equivalent to the expression that was cached. By default, this should be omitted for all packages except perhaps the top-level package, since it contains the same expression that is stored inside ./.cache/dhall/1220XXX…XXX, albeit less efficiently (the binary encoding is more compact).

      This file is called ./source.dhall because this is the Dhall equivalent of a "source" package distribution, which contains valid source code to produce the same result.
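
The file-naming rule for the cache entry can be sketched in POSIX shell. This is only an illustration: cache_file_name is a hypothetical helper (not part of Dhall), and its input is assumed to be the sha256:… string that dhall hash prints.

```shell
# Hypothetical helper: map the output of `dhall hash` (sha256:<hex>)
# to the file name used inside the cache directory (1220<hex>).
cache_file_name() {
    digest="$1"
    case "$digest" in
        sha256:*)
            # Strip the "sha256:" prefix and prepend "1220".
            printf '1220%s\n' "${digest#sha256:}"
            ;;
        *)
            echo "expected sha256:<hex>, got: $digest" >&2
            return 1
            ;;
    esac
}

# Example usage, assuming a `dhall` executable on PATH and a ./package.dhall file:
#   cache_file_name "$(dhall hash --file ./package.dhall)"
```

The result is the name to give the output of dhall encode when placing it under the dhall subdirectory of the cache.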

    User interface

    The function for building a package takes four arguments:

    • The package name

      This is not material to the build. It's just to name things since every Nix package has to have a human-readable name.

    • The dependencies for the build

      Each of these dependencies is a build product that produces a directory tree just like the one I described above (i.e. a ./.cache directory, a ./binary.dhall file, and an optional ./source.dhall file).

    • A Dhall expression

      This can be arbitrary Dhall source code, with only one caveat: all remote imports transitively referenced by the expression must be protected by integrity checks AND those imports must match one of the dependencies of this Dhall package (so that the import can be satisfied via the cache instead of the Dhall runtime fetching the URL).

    • A boolean option specifying whether to keep the ./source.dhall file, which is False by default

    Implementation

    The way that the Dhall package builder works is:

    • First, build the Haskell Dhall package with the -f-with-http flag

      This flag compiles out support for HTTP remote imports, so that if the user forgets to supply a dependency for a remote import they get an error message saying Import resolution is disabled.

      We'll be using this executable for all of the subsequent steps.

    • Create a cache directory within the current working directory named .cache/dhall

      ... and populate the cache directory with the binary files stored inside each dependency's ./.cache/dhall directory

    • Configure the interpreter to use the cache directory we created

      ... by setting XDG_CACHE_HOME to point to the .cache directory we just created in our current working directory

    • Interpret and α-normalize the Dhall source code for our package

      ... using the dhall --alpha command. Save the result to $out/source.dhall where $out is the directory that will store the final build product

    • Obtain the expression's hash

      ... using the dhall hash command. We will need this hash for the following two steps.

    • Create the corresponding binary cache file

      ... using the dhall encode command and save the file to $out/.cache/dhall/1220${HASH}

    • Create the ./binary.dhall file

      ... by just writing out a text file to $out/binary.dhall containing missing sha256:${HASH}

    • Optional: Delete the ./source.dhall file

      ... if the user did not request to keep the file. Omitting this file by default helps conserve space within the package store by not storing the same expression twice (as both a binary file and source code).
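
Taken together, the steps above might look roughly like the following POSIX shell sketch, which could serve as a starting point for a Bazel genrule command or a custom rule's action. It assumes a dhall executable (ideally built without HTTP support) is on PATH and uses the .cache/dhall layout shown in the tree output earlier. The function names populate_cache and build_dhall_package, and their argument order, are hypothetical and chosen only for illustration.

```shell
# Copy every cached binary from each dependency directory into ./.cache/dhall.
# Each argument is assumed to be a package directory containing .cache/dhall.
populate_cache() {
    mkdir -p .cache/dhall
    for dep in "$@"; do
        if [ -d "$dep/.cache/dhall" ]; then
            cp -R "$dep/.cache/dhall/." .cache/dhall/
        fi
    done
}

# Usage: build_dhall_package SOURCE_FILE OUT_DIR KEEP_SOURCE [DEPENDENCY_DIR...]
# KEEP_SOURCE is "true" to keep source.dhall; anything else deletes it.
build_dhall_package() {
    src="$1"; out="$2"; keep_source="$3"
    shift 3

    populate_cache "$@" || return 1

    # Point the interpreter at the cache we just populated.
    XDG_CACHE_HOME="$PWD/.cache"
    export XDG_CACHE_HOME

    mkdir -p "$out/.cache/dhall"

    # Resolve imports (from the cache) and alpha-normalize the expression.
    dhall --alpha --file "$src" > "$out/source.dhall" || return 1

    # `dhall hash` prints sha256:<hex>.
    digest="$(dhall hash --file "$out/source.dhall")" || return 1

    # Write the binary cache entry, named 1220<hex>.
    dhall encode --file "$out/source.dhall" \
        > "$out/.cache/dhall/1220${digest#sha256:}" || return 1

    # binary.dhall only resolves if the hash is already cached.
    echo "missing $digest" > "$out/binary.dhall"

    # Drop the source form unless the caller asked to keep it.
    if [ "$keep_source" != "true" ]; then
        rm "$out/source.dhall"
    fi
}

# Example (hypothetical paths):
#   build_dhall_package ./package.dhall ./out false ./deps/Prelude
```

Copying the dependency caches (rather than symlinking them) keeps the sketch portable; a real rule may prefer symlinks to avoid duplicating files.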

    Packaging conventions

    Once you have this function, a couple of conventions help simplify doing things "in the large":

    • By default, a package should build a project's ./package.dhall file

    • Make it easy to override the package version

    • Make it easy to override the file built within the package

      In other words, if a user prefers to import individual files like https://prelude.dhall-lang.org/List/map instead of the top-level ./package.dhall file there should be a way for them to specify a dependency like Prelude.override { file = "./List/map"; } to obtain a package that builds and caches that individual file.

    Conclusion

    I hope that helps! If you have more questions about how to do this you can either ask them here or you can also discuss more on our Discourse forum, especially on the thread where this idiom first originated: