java annotations lombok

How can I add methods to a Java class at compile time using annotations or similar techniques, similar to Lombok?


I'm looking for a solution to dynamically add methods to a Java class during compile time, similar to how Lombok operates with annotations. Specifically, I want to automate method generation based on custom annotations, such as creating getter methods for fields annotated with a custom annotation. My goal is to explore methods beyond Lombok and understand the best practices and tools available for achieving this during the compilation process. If anyone has insights or code examples for adding methods to a class at compile time, especially using annotations, I would appreciate their guidance.

I've explored Lombok and other annotation processors to understand how they dynamically add methods during compilation. While Lombok provides a solution for common scenarios, I'm interested in a more customized approach based on specific annotations. My expectation is to discover alternative techniques or tools that enable the dynamic addition of methods to a Java class at compile time, particularly in response to custom annotations. However, I haven't found a satisfactory solution yet, prompting me to seek insights, examples, or recommendations from the community.


Solution

  • Annotation Processors cannot (directly) do this: the 'spec' of annotation processors restricts you to making entirely new types; you can't edit existing ones.

    So what's lombok then?

    Lombok is a compiler plugin that transforms syntactically valid (but semantically probably invalid) ASTs. It isn't an annotation processor. However, it can plug into the compiler in a number of ways, and one of those is via the route of being initialized as an annotation processor.

    You can of course do the same, in which case forking lombok is the right way to go. It's a complicated proposition and requires custom work for every tool you want to support (which in practice isn't nearly as dire as it sounds: most tools simply use vanilla javac, so all you need to do is tell the tool to ensure lombok is plugged in). Still, a lot of work has gone into lombok, and the vast majority of it is 'infrastructure': ensuring that plugging in is possible and works appropriately.

    Are there alternatives?

    The usual trick with Annotation Processors (APs) that follow the actual annotation processor APIs is to treat the java code you write by hand as a 'template', and have the entirely-generated-from-scratch source file (generated by the AP) be the 'real' type used everywhere else. This is, for example, how auto-value works.

    The downside is that any edit that changes the generated content only shows up after a complete build. Contrast that with lombok, where you don't even have to save the file: you edit something (say, change the name of a field annotated with @Getter) and the name of the lombok-generated getter method changes along with it instantaneously. Waiting around for the build isn't something anybody wants to do. Well, almost nobody:

    [Image: "Yay, compiling!"]

    If you're willing to accept that severe penalty, then it looks something like this:

    // You actually write this class out manually:
    
    @AddDoohickeys
    class FooTemplate {
      public void addedByProgrammer() { /* ... */ }
    }
    

    and during compilation the AP sees that and makes this:

    // Generated entirely by AP
    public class Foo extends FooTemplate {
      public void addedByAP() { /* ... */ }
    }
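
    For a rough idea of what the AP side of this could look like, here is a minimal sketch using only the standard javax.annotation.processing API. The processor name (DoohickeyProcessor), the helper renderSubclass, the 'strip the Template suffix' naming rule and the assumption that @AddDoohickeys lives in the unnamed package are all made up for illustration; it only handles top-level classes and the generated method body is empty.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.element.PackageElement;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Assumption: the annotation is declared as 'AddDoohickeys' in the unnamed package.
@SupportedAnnotationTypes("AddDoohickeys")
public class DoohickeyProcessor extends AbstractProcessor {

    private static final String SUFFIX = "Template"; // hypothetical naming convention

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element e : roundEnv.getElementsAnnotatedWith(annotation)) {
                String templateName = e.getSimpleName().toString();
                // Only top-level-style classes named XxxTemplate are handled in this sketch.
                if (e.getKind() != ElementKind.CLASS || !templateName.endsWith(SUFFIX)) continue;

                PackageElement pkg = processingEnv.getElementUtils().getPackageOf(e);
                String pkgName = pkg.isUnnamed() ? "" : pkg.getQualifiedName().toString();
                String simple = templateName.substring(0, templateName.length() - SUFFIX.length());
                String qualified = pkgName.isEmpty() ? simple : pkgName + "." + simple;

                // A brand-new source file; the existing template class is never modified.
                try (Writer w = processingEnv.getFiler().createSourceFile(qualified, e).openWriter()) {
                    w.write(renderSubclass(pkgName, templateName));
                } catch (IOException ex) {
                    processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, ex.toString(), e);
                }
            }
        }
        return true;
    }

    // Builds the source text of the generated subclass (FooTemplate -> Foo).
    static String renderSubclass(String pkgName, String templateName) {
        String simple = templateName.substring(0, templateName.length() - SUFFIX.length());
        StringBuilder src = new StringBuilder();
        if (!pkgName.isEmpty()) src.append("package ").append(pkgName).append(";\n\n");
        src.append("// Generated entirely by AP\n")
           .append("public class ").append(simple).append(" extends ").append(templateName).append(" {\n")
           .append("  public void addedByAP() {}\n")
           .append("}\n");
        return src.toString();
    }
}
```

    The processor would be enabled in the usual way, e.g. with javac's -processor option or a META-INF/services/javax.annotation.processing.Processor entry.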
    

    And all other code just uses Foo, and can use both addedByAP() and addedByProgrammer() equally. There are a few related takes on the idea. For example, the template can contain static void addedByProgrammer(Foo x) { ... }, and the AP generates the Foo class not with extends FooTemplate but with a wrapper method instead, e.g. public void addedByProgrammer() { FooTemplate.addedByProgrammer(this); }. This avoids having to extend a class whose access rights make it otherwise invisible, if you find that sort of thing ugly.
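
    The wrapper-method variant could be sketched roughly like this. Both classes are nested in one outer class (DelegationSketch) purely so the example fits in a single file, and the log field is a hypothetical addition that only exists to make the calls observable; in a real setup Foo would be written out by the AP rather than by hand.

```java
final class DelegationSketch {

    // Written by hand. Not public, so code outside the package never sees it.
    static final class FooTemplate {
        private FooTemplate() {}

        static void addedByProgrammer(Foo self) {
            self.log.append("addedByProgrammer;");
        }
    }

    // Stand-in for what the AP would generate: no 'extends FooTemplate',
    // just a wrapper that forwards to the static template method.
    static final class Foo {
        final StringBuilder log = new StringBuilder(); // hypothetical, for demonstration only

        public void addedByAP() {
            log.append("addedByAP;");
        }

        public void addedByProgrammer() {
            FooTemplate.addedByProgrammer(this);
        }
    }
}
```

    Callers still see one type, Foo, with both methods on it; the template never shows up in Foo's type hierarchy.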

    So, no, I don't want that, back to lombok

    Lombok is already very complicated, and you run into a serious problem I call the chicken-and-egg problem.

    For your compiler plugin to run, you really don't want to go off of 'sack of chars' style raw source files. It is incredibly difficult to reason about raw source code; you need to parse it into something more workable, such as a tree structure. One can parse any source file into a so-called AST (Abstract Syntax Tree), but this isn't as workable as you might think. For example, this:

    public void foo(java.lang.String args) {
      c.a.v = 10;
    }
    

    This has an interesting thing in it. What is that c.a.v? Depending on the full context (e.g. the classpath and the like), you can draw some wildly different conclusions:

    • c is a package, a is a class, and v is a public static non-final int field; this statement updates that field.
    • c is the name of a protected field from a supertype, a is a visible field in whatever the type of c is, and v is a field in that.
    • c is a class name of a class in the same package as this code. a is a class name that is a nested class inside c, and v is a public static variable in that.

    The thing is, without that context, it is not possible to know which one because unfortunately java does not have rules about this. Some languages decree that classes must start with a capital letter, variables must start with an underscore, and package names must start with a lowercase letter - with such systems you can know what's happening without the context of a class path. Not so with java.
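
    To make that concrete, here is a small sketch in which the exact same statement compiles twice with two different meanings. All class names are made up, and the 'field' reading is simplified to a field in the same class rather than a protected field from a supertype.

```java
final class AmbiguitySketch {

    static final class Inner { int v; }
    static final class Holder { final Inner a = new Inner(); }

    // Reading 1: 'c' is a field, 'a' is a field of c's type, 'v' is a field of a's type.
    static class FieldReading {
        final Holder c = new Holder();

        void foo() {
            c.a.v = 10;
        }
    }

    // Reading 2: 'c' is a class, 'a' is a class nested in it, 'v' is a static field.
    // Lowercase class names are legal java, which is exactly the problem.
    static class TypeReading {
        static class c {
            static class a {
                static int v;
            }
        }

        void foo() {
            c.a.v = 10;
        }
    }
}
```

    The body of foo() is character-for-character identical in both classes; only the surrounding declarations tell the compiler which reading applies.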

    So, in AST form, that is available in the tree as "select field 'v' on node X", where X is "select field 'a' on node Y", where Y is "identifier 'c'".
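
    As a toy illustration of that shape (this is not javac's or ecj's actual AST API, just hypothetical Ident and Select node classes), the expression is nothing more than nested 'select' nodes around a name, with no type information anywhere:

```java
final class AstSketch {

    interface Node {
        String describe();
    }

    // A bare name such as 'c'. The node does not know if it is a package, class or variable.
    static final class Ident implements Node {
        final String name;

        Ident(String name) {
            this.name = name;
        }

        public String describe() {
            return "identifier '" + name + "'";
        }
    }

    // 'target.field', where target is any other node.
    static final class Select implements Node {
        final Node target;
        final String field;

        Select(Node target, String field) {
            this.target = target;
            this.field = field;
        }

        public String describe() {
            return "select '" + field + "' on (" + target.describe() + ")";
        }
    }

    // The tree for the expression c.a.v
    static Node cDotADotV() {
        return new Select(new Select(new Ident("c"), "a"), "v");
    }
}
```

    Calling AstSketch.cDotADotV().describe() yields a purely structural description; nothing in the tree can answer what type 'v' has.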

    You can't ask an AST: Hey, so, what type is 'v' then? You can't ask it: "Please list all members of whatever 'c' is" - that might not even make sense (if 'c' is a package, that's a weird question to ask).

    So, ASTs aren't all that nice either, which gets us to LSTs. In an LST, 'c.a.v' is properly attributed, with every part of it typed; once you have an LST, it represents, for example: "Write to field 'v' of nested class 'a' (which is of type com.foo.c$a), which is in outer class 'c', which is in package com.foo - the same package this code is in".

    LSTs are by far the most convenient and what you tend to think about when writing java code.

    However, once you have arrived at an LST you can no longer add stuff. After all, adding a method at that point can completely change the meaning of the code. Simply adding a method can change an expression's meaning entirely: what used to look like foo(x) still looks like foo(x), but before, it was a reference to a static method in your parent type that returned a String, and now all of a sudden it is a reference to an instance method in your own type that returns an int. And if that expression is itself passed as an argument to another method, that can in turn completely change which method that call is 'bound' to.
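
    Here is a small sketch of that effect, reading 'parent type' as the superclass. Before and After have an identical run() method; the only difference is that After has one extra method, which is what a code generator might have added. All names are hypothetical.

```java
final class RebindingSketch {

    static class Parent {
        static String foo(Object x) {
            return "text";
        }
    }

    static String describe(String s) {
        return "describe(String)";
    }

    static String describe(int i) {
        return "describe(int)";
    }

    // foo("x") can only mean Parent.foo(Object), which returns a String.
    static class Before extends Parent {
        String run() {
            return describe(foo("x"));
        }
    }

    // Same run(), but the added, more specific foo(String) now wins overload
    // resolution. It returns an int, so the outer call re-binds to describe(int).
    static class After extends Parent {
        int foo(String x) {
            return x.length();
        }

        String run() {
            return describe(foo("x"));
        }
    }
}
```

    Any attribution that was computed for run() before the method was added is therefore stale, which is why the compiler has to start over.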

    So, if you want to add a method but want an LST around to know what to do, you can do that, but you then have to invalidate the whole thing and tell the compiler to start over again. It can keep the AST, but if you've added stuff by writing out java code, that needs to be parsed. If you add directly to the AST you can skip the expensive step of 'turn this sack of characters into an AST', but java does not have a unified AST, so you're now going to have to write code to do the job of editing the AST at the very least for javac (which is non-standard API that changes in breaking ways all the time) as well as for ecj, and that covers most but not all uses out there in the wild.

    Annotation Processing suffers from the same problem and introduces the convoluted 'rounds' concept to try to deal with this.

    Lombok 'deals' with it by avoiding it almost entirely - the vast majority of lombok features are written based on ASTs, not LSTs. That's why lombok doesn't have a feature to, for example, 'add a constructor that is just a wrapper around some constructor of the parent type, passing all params through'. Because doing that requires knowing the contents of a parent type, and that's LST territory.

    So, you can do this, but it is incredibly complicated.

    Some context

    In case it wasn't obvious, I'm one of the core authors of Project Lombok.