Datanucleus enhancement with Bazel


I'm trying to migrate a Maven project to Bazel and am having trouble with Datanucleus enhancement.

After the jar file is built, Datanucleus looks inside it and manipulates the bytecode to enhance persistable classes. The way to do this in Bazel is to define a rule that takes the *.jar output of a java_library rule and creates a new, enhanced version of the library.

The problem is that my rule needs the datanucleus-core package from external libraries. When I try to access it from a genrule with $(location //third_party:datanucleus_core), it points to a jar that contains no classes:

(genrule) cmd = "echo $(location //third_party:datanucleus_core)"
bazel-out/local-fastbuild/bin/third_party/liborg_datanucleus_datanucleus_core.jar

(genrule) cmd = "jar tf $(location //third_party:datanucleus_core)"
META-INF/
META-INF/MANIFEST.MF

The jar-file resolved by Bazel in genrule from $(location //third_party:datanucleus_core) contains only META-INF/MANIFEST.MF with the following content:

Manifest-Version: 1.0
Created-By: blaze

I tried to use a java_binary rule, which puts the correct datanucleus_core.jar on the classpath, but Datanucleus enhances my library in place and fails to write its changes to disk (it would have to overwrite the rule's input file). Also, a java_binary rule is not supposed to be used for building.

So the question is: what is the best way to enhance a jar library in Bazel by running the Datanucleus utility, which is provided as a third-party dependency in a Maven repository?

Bazel build label: 0.3.2-homebrew, OS: OS X El Capitan (10.11.6), java: 1.8.0_92

Update

Datanucleus dependency declaration:

# WORKSPACE
maven_jar(
    name = "org_datanucleus_datanucleus_core",
    artifact = "org.datanucleus:datanucleus-core:5.0.3",
)

# third_party/BUILD
java_library(
    name = "org_datanucleus_datanucleus_core",
    visibility = ["//visibility:public"],
    exports = ["@org_datanucleus_datanucleus_core//jar"],
)

(in my question I shortened org_datanucleus_datanucleus_core to datanucleus_core)


Solution

  • As Neil Stockton mentioned, you cannot enhance classes in a jar. So, the basic strategy will be:

    1. Create the jar.
    2. Unjar the class files.
    3. Run the enhancements.
    4. Jar it back up.

    Steps 2 through 4 have to be rolled into a single build rule, because Bazel insists that you declare all inputs and outputs of a rule (and you cannot know in advance which .class files a .java file will generate, which is why Bazel always jars them up).

    Create a datanucleus.bzl file to declare your enhancement rule in. It should look something like:

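    # Run Datanucleus enhancement on the java_library named "jarname".
    def enhance(jarname):
      # src is the name of the jar file your java_library rule generates.
      src = "lib" + jarname + ".jar"
      native.genrule(
          name = jarname + "-enhancement",
          srcs = [
              src,
              "//third_party:datanucleus_core",
          ],
          outs = [jarname + "-enhanced.jar"],
          cmd = """
    # Un-jar the .class files into a scratch directory (jar xf extracts, jar tf only lists).
    root=$$PWD
    mkdir -p classes
    (cd classes && jar xf $$root/$(location {0}))
    # Run the enhancer on each extracted class; it rewrites the .class file in place.
    # Assumption: the enhancer entry point is org.datanucleus.enhancer.DataNucleusEnhancer,
    # and the classpath may also need the JDO/JPA API jars.
    for class in $$(find classes -name '*.class'); do
      java -cp classes:$(location //third_party:datanucleus_core) org.datanucleus.enhancer.DataNucleusEnhancer $$class
    done
    # Jar them back up.
    jar cf $@ -C classes .""".format(src),
      )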
    

    (I'm not too familiar with Datanucleus, so the cmd might need some modification, but that is the general idea.)

    Then, in your BUILD file, you'd do:

    # Import the macro you wrote; load() statements go at the top of the BUILD file.
    load('//:datanucleus.bzl', 'enhance')

    java_library(
        name = "my-lib",
        srcs = glob(["*.java"]),
        deps = ["..."],
    )

    enhance("my-lib")
    

    Now you can do:

    bazel build //:my-lib-enhanced.jar
    

    and use my-lib-enhanced.jar as a dependency in other java_ rules.
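
    One way to consume the enhanced jar, sketched here under assumptions: wrap the generated file in a java_import so that other Java rules can list it in deps. The target names "my-lib-enhanced" and "app", the source file App.java, and the main class are hypothetical and only illustrate the wiring.

    ```starlark
    # Wrap the generated jar so that java_ rules can depend on it.
    java_import(
        name = "my-lib-enhanced",          # hypothetical target name
        jars = [":my-lib-enhanced.jar"],   # output of the enhance() genrule
    )

    # Hypothetical consumer of the enhanced classes.
    java_binary(
        name = "app",
        srcs = ["App.java"],
        main_class = "App",
        deps = [":my-lib-enhanced"],
    )
    ```

    If the enhanced classes need the original library's dependencies at compile time, those can be listed in the java_import's deps attribute as well.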

    More info on .bzl files: https://bazel.build/versions/master/docs/skylark/concepts.html.


    Edited to add more info on depending on a jar:

    There are a couple of options to get a jar that contains the content of datanucleus. First, you don't need the layer of indirection: you can just say:

          srcs = [
              src,
              "@org_datanucleus_datanucleus_core//jar",
          ],
    

    This will give you the actual jar.
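
    To check that this label really resolves to a jar with classes in it, the diagnostic from the question can be repeated against the external repository directly. This is a minimal sketch: the target and output names are hypothetical, the repository name is the one declared in the question's WORKSPACE, and it assumes jar is on the PATH as in the question's own genrule.

    ```starlark
    # Writes the jar's table of contents to a text file for inspection.
    genrule(
        name = "list-datanucleus-classes",        # hypothetical name
        srcs = ["@org_datanucleus_datanucleus_core//jar"],
        outs = ["datanucleus-core-contents.txt"], # hypothetical output file
        cmd = "jar tf $(location @org_datanucleus_datanucleus_core//jar) > $@",
    )
    ```

    If the output lists .class entries rather than just META-INF/MANIFEST.MF, the same label can be used in the enhancement rule's srcs and $(location ...) expression.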

    If, for some reason, you need the jar to be in third_party, you can modify third_party/BUILD to create a deploy jar, which is a Java binary that bundles up all of its dependencies for deployment (since you're not actually going to use it as a binary, you can use whatever you want for the main class name):

    java_binary(
        name = "datanucleus-core",
        main_class = "whatever",
        runtime_deps = ["@org_datanucleus_datanucleus_core//jar"],
    )
    
    genrule(
        name = "your-lib",
        srcs = [":datanucleus-core_deploy.jar", ...],
    )
    

    The :datanucleus-core_deploy.jar is called an implicit target: it's only built if requested, but it can be generated from your java_binary declaration.