
Bytecode manipulation/enhancement and Java Instrumentation API


I'm having a hard time wrapping my head around the relationship between bytecode manipulation/enhancement and the Java Instrumentation API.

Based on my understanding, there are two options for doing any bytecode manipulation/enhancement:

  • Build-time - Java classes are compiled to *.class files, then some other library/application is run to do the manipulation.
  • Load-time - Only by making use of the Java Instrumentation API, which means a particular javaagent must be provided.

The things I'm not sure about:

  • Is there such a thing as build-time bytecode manipulation, and what are some frameworks/libs that support it (e.g. Javassist, ASM)? Do they use some common approach, or do they just read and parse the bytecode and then provide you a way to modify it?

  • Does load-time manipulation rely only on the Java Instrumentation API? That is, do all the available frameworks/libs (e.g. Javassist, ASM) use a javaagent to do the manipulation?

Please note that I have very little experience with this topic, so there is a chance I misunderstood or missed some concepts. I'm trying to boil this complex topic down to a simple explanation, even if it is very general or demonstrated using an analogy.


Solution

  • Think of a compiled Java class file as a byte[] array that contains all information about a specific Java class. Instrumentation in this context refers to the process of post-processing this byte array into a different shape, independently of when or how this happens. Instrumentation can be applied at any time between compilation and class loading; in Java, a class can even be instrumented after it has been loaded, with the limitation that its shape must not change, i.e. no fields or methods can be added or removed. But no matter when instrumentation is applied, the concept remains the same: rearranging a byte array that represents a compiled Java class.

    Any byte code manipulation library that I am aware of allows you to process class files from any source. Typically, the most generic input to these libraries is a simple byte array, which can optionally be loaded from a class loader for convenience. A class file can be looked up from a class loader via the ClassLoader.getResourceAsStream method with the name of the class file as argument. For example:

    classLoader.getResourceAsStream("some/Sample.class")
    

    should resolve the class file for an imaginary some.Sample class. This typically works because class files (the byte arrays) need to be located by the class loader in order to load a class when it is requested for the first time.
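    As a minimal sketch of this lookup, the following reads a class file into a byte array and prints its first four bytes. The names ClassFileLookup and readClassFile are made up for illustration, and java.lang.String stands in for the imaginary some.Sample class so that the example runs without extra classes. It assumes Java 9 or later for InputStream.readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileLookup {

    // Reads the raw class file of a type through a class loader.
    // (Hypothetical helper, not part of any library.)
    static byte[] readClassFile(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            // Bootstrap classes such as java.lang.String report a null loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("No class file found for " + resource);
            }
            return in.readAllBytes(); // requires Java 9 or later
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] classFile = readClassFile(String.class);
        // Every valid class file starts with the magic number 0xCAFEBABE.
        System.out.printf("Read %d bytes, starting with %02X%02X%02X%02X%n",
                classFile.length, classFile[0], classFile[1], classFile[2], classFile[3]);
    }
}
```

    The returned byte array is the kind of input that a byte code manipulation library would then parse and rewrite.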

    At build time, class files are normally located in a specific folder, e.g. in the target/classes folder of a Maven build. To instrument those classes, you only need to find those files, read them into a byte array, and then write back the changed result. You can do this, for example, by writing your own Maven plugin in which you use ASM to adjust the files. For convenience, you can also use a more high-level tool such as Byte Buddy's Maven plugin, into which you can load your own plugin and avoid the Maven plugin API and even byte code APIs entirely. (For information, I am the author of Byte Buddy.)
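    The find, read, and write-back loop described above can be sketched with plain java.nio, without any build tool. This is only an illustration: BuildTimeInstrumenter and instrument are hypothetical names, the transformer is a placeholder where a library such as ASM would do the actual rewriting, and the default target/classes path is just the Maven convention mentioned above. Path.of assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BuildTimeInstrumenter {

    // Rewrites every *.class file below classesDir in place and
    // returns the number of files that were processed.
    static int instrument(Path classesDir, UnaryOperator<byte[]> transformer) {
        List<Path> classFiles;
        try (Stream<Path> paths = Files.walk(classesDir)) {
            classFiles = paths
                    .filter(Files::isRegularFile)
                    .filter(path -> path.toString().endsWith(".class"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (Path classFile : classFiles) {
            try {
                byte[] original = Files.readAllBytes(classFile);
                // Placeholder: a byte code library would rewrite the array here.
                Files.write(classFile, transformer.apply(original));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return classFiles.size();
    }

    public static void main(String[] args) {
        Path classesDir = Path.of(args.length > 0 ? args[0] : "target/classes");
        if (!Files.isDirectory(classesDir)) {
            System.out.println("No such directory: " + classesDir);
            return;
        }
        // The identity transformer leaves every class file unchanged.
        int count = instrument(classesDir, UnaryOperator.identity());
        System.out.println("Processed " + count + " class files");
    }
}
```

    A build plugin would typically run such a step after compilation and before packaging, so that the jar file already contains the instrumented classes.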

    At runtime, you could do a very similar thing, i.e. locate the class files in some folder or jar file and adjust them before they are loaded by the application. This would however not always work well, since jar files might also be used by other applications that would then be affected too. Additionally, it would require your users to explicitly activate this instrumentation from their application. Therefore, class file instrumentation is often applied using a Java agent, which gives access to the Instrumentation API and makes this much more convenient. The API allows you to install a hook into Java's internal class loading mechanism, which makes it possible to adjust a class's byte array right before it is loaded:

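    // Since Java 9, both transform methods of ClassFileTransformer are default
    // methods, so a lambda no longer compiles; an anonymous class works.
    instrumentation.addClassFileTransformer(new ClassFileTransformer() {
      @Override
      public byte[] transform(Module module, ClassLoader loader, String name,
                              Class<?> classIfLoaded, ProtectionDomain pd, byte[] classFile) {
        // Returning null instead would leave the class file unchanged.
        byte[] transformed = doSomethingWith(classFile);
        return transformed;
      }
    });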
    

    This change is then isolated to the application and does not change the original class files. The Instrumentation API does not imply the use of any particular library to modify a class file; this is fully up to you, and you are free to use a library of any sort or even to manipulate the byte array directly. A high-level library such as Byte Buddy does not even require you to implement your own class file transformer but has its own abstraction via the AgentBuilder API, which does however create a class file transformer under the covers to make use of the Instrumentation API's unique capabilities. Other libraries such as ASM or Javassist have no relationship to the Instrumentation API, however, and would require you to implement your own class file transformer in which you use the APIs of those libraries to process the presented class file.
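    To show where such a hand-written class file transformer lives, here is a minimal sketch of a Java agent that uses no byte code library at all. SampleAgent and LoggingTransformer are hypothetical names, and the some/ package prefix refers to the imaginary some.Sample class from above. The transformer only logs and returns null, which tells the JVM to keep the class file as it is; a real agent would call into ASM, Javassist, or similar at the marked line.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class SampleAgent {

    // Called by the JVM before the application's main method when the JVM
    // is started with -javaagent pointing at a jar that names this class.
    public static void premain(String arguments, Instrumentation instrumentation) {
        instrumentation.addClassFileTransformer(new LoggingTransformer());
    }

    static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String name, Class<?> classIfLoaded,
                                ProtectionDomain pd, byte[] classFile) {
            // Class names are passed in internal form, e.g. some/Sample.
            if (name != null && name.startsWith("some/")) {
                System.out.println("Loading " + name + " (" + classFile.length + " bytes)");
                // A library such as ASM or Javassist would rewrite classFile here.
            }
            return null; // null means: keep the class file unchanged
        }
    }

    // Lets you try the transformer directly, without packaging an agent jar.
    public static void main(String[] args) {
        byte[] result = new LoggingTransformer()
                .transform(null, "some/Sample", null, null, new byte[3]);
        System.out.println("Class file replaced: " + (result != null));
    }
}
```

    To run this as an actual agent, the class would be packaged into a jar file whose manifest contains the entry Premain-Class: SampleAgent, and the application would be started with java -javaagent:agent.jar, where agent.jar is a placeholder for that jar file.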