Search code examples
javainstrumentationjavaagents

How to speed up runtime Java code instrumentation?


I made a Java agent which is attached to a JVM during runtime and instruments all the loaded project classes and inserts some logging statements. There are 11k classes in total. I measured the total time taken by the transform method of my ClassFileTransformer and it was 3 seconds. But the duration of the whole instrumentation process takes about 30 seconds. This is how I retransform my classes:

 instrumentation.retransformClasses(myClassesArray);

I assume most time is taken up by the JVM to reload changed classes. Is that right? How can I speed up the instrumentation process?

Update:
When my agent is attached,

instrumentation.addTransformer(new MyTransfomer(), true);
instrumentation.retransformClasses(retransformClassArray);

is called only once.

Then MyTransfomer class instruments the classes and measures the total duration time of instrumentation:


public class MyTransfomer implements ClassFileTransformer {
private long total = 0;
private long min = ..., max = ...;

public final byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classFileBuffer) {
   long s = System.currentTimeMillis();
   if(s < min) min = s;
   if(s > max) max = s;
   byte[] transformed = this.transformInner(loader, className, classFileBuffer);

   this.total += System.currentTimeMillis() - s;
   
   return transformed;
  }
}

After all the classes are instrumented(from the initial array) (a global cache keeps track of the instrumented classes) total is printed and it will be ~3 seconds. But max-min is ~30 seconds.

Update 2:

After looking at the stack trace this is what happens: I call

instrumentation.retransformClasses(retransformClassArray);

which calls the native method retransformClasses0(). After some time(!) the JVM calls the transform() method of the sun.instrument.InstrumentationImpl class(but this method takes only one class at a time, so the JVM calls this method multiple times consecutively), which calls transform() on the sun.instrument.TransformerManager object which has a list with the all the ClassTransformers registered and calls each of these transformers to transform the class(I have only one transformer registered!!).

So to my opinion, most of the time is spent in the JVM (after retransformClasses0() is called and before each call to sun.instrument.InstrumentationImpl.transform()). Is there a way to reduce the time needed by the JVM to carry out this task?


Solution

  • Correction:

    Because the retransformClasses(classArr) will not retransform all the elements in the classArr at once, instead it will retransform each of them as needed(eg. while linking).(refer to the jdk [VM_RedefineClasses][1] and [jvmtiEnv][2]), it does retransform all of them at once.

    What retransformClasses() does:

    1. Transfer control to native layer, and give it a class list which we want to transform
    2. For every class to be transformed, the native code tries to get a new version by calling our java transformer, this leads to a transfer of control between the java code and native.
    3. The native code replace the appropriate parts of internal representation by the given new class version one another.

    In step 1:

    java.lang.instrument.Instrumentation#retransformClasses calls sun.instrument.InstrumentationImpl#retransformClasses0 which is a JNI method, the control will be transferred to native layer.

    // src/hotspot/share/prims/jvmtiEnv.cpp
    jvmtiError
    JvmtiEnv::RetransformClasses(jint class_count, const jclass* classes) {
      ...
      VM_RedefineClasses op(class_count, class_definitions, jvmti_class_load_kind_retransform);
      VMThread::execute(&op);
      ...
    } /* end RetransformClasses */
    
    

    In step 2:

    This step is implemented by KlassFactory::create_from_stream, this procedure will post a ClassFileLoadHook event whose callback can acquire the transformed bytecode by invoking the java transformer method. In this step, the control will switch back and forth between native code and java code.

    // src/hotspot/share/classfile/klassFactory.cpp
    // check and post a ClassFileLoadHook event before loading a class
    // Skip this processing for VM hidden or anonymous classes
    if (!cl_info.is_hidden() && (cl_info.unsafe_anonymous_host() == NULL)) {
      stream = check_class_file_load_hook(stream,
                                          name,
                                          loader_data,
                                          cl_info.protection_domain(),
                                          &cached_class_file,
                                          CHECK_NULL);
    }
    
    //src/java.instrument/share/native/libinstrument/JPLISAgent.c :
    //call java code sun.instrument.InstrumentationImpl#transform
    transformedBufferObject = (*jnienv)->CallObjectMethod(
       jnienv,
       agent->mInstrumentationImpl, //sun.instrument.InstrumentationImpl
       agent->mTransform, //transform
       moduleObject,
       loaderObject,
       classNameStringObject,
       classBeingRedefined,
       protectionDomain,
       classFileBufferObject,
       is_retransformer);
    

    In step 3:

    VM_RedefineClasses::redefine_single_class(jclass the_jclass, InstanceKlass* scratch_class, TRAPS) method replaces parts (such as constant pool, methods, etc.) in target class with parts from transformed class.

    // src/hotspot/share/prims/jvmtiRedefineClasses.cpp
    for (int i = 0; i < _class_count; i++) {
      redefine_single_class(_class_defs[i].klass, _scratch_classes[i], thread);
    }
    

    So how to speed up runtime Java code instrumentation?

    In my project, the total time and max-min time are almost the same if the app is in a paused state while transforming. can you provide some demo code?

    It's impossible to change the way jvm works, so multithreading may not be a bad idea. It got several times faster after using multithreading in my demo project.