Tags: java, jvm, instrumentation, java-bytecode-asm, javaagents

How to monitor object creation using java agent and ASM?


I want to monitor object creation and record a unique ID for each created object.

First, I tried instrumenting the NEW instruction, but that does not work and throws VerifyError: (...) Expecting to find object/array on stack. I heard that the object reference left by NEW is uninitialized, so it cannot be passed to other methods. So I abandoned this approach.

Second, I tried instrumenting the invocation of <init>, the method that initializes the uninitialized object. But I am not sure whether the initialized object is left on the stack after the initialization.

In my method visitor adapter:

public void visitMethodInsn(int opc, String owner, String name, String desc, boolean isInterface) {
    ...
    mv.visitMethodInsn(opc, owner, name, desc, isInterface);
    if (opc == INVOKESPECIAL && name.equals("<init>")) {
        mv.visitInsn(DUP);
        mv.visitMethodInsn(INVOKESTATIC, "org/myekstazi/agent/PurityRecorder", "object_new",
                "(Ljava/lang/Object;)V", false);
    }
}

In MyRecorder.java:

public static void object_new(Object ref){
    log("object_new !");
    log("MyRecorder: " + ref);
    log("ref.getClass().getName(): " + ref.getClass().getName());
}

I tried this in a demo, and it throws a VerifyError:

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.VerifyError: Operand stack underflow
Exception Details:
  Location:
    AbstractDemo.<init>()V @4: dup
  Reason:
    Attempt to pop empty stack.
  Current Frame:
    bci: @4
    flags: { }
    locals: { 'AbstractDemo' }
    stack: { }
  Bytecode:
    0x0000000: 2ab7 0001 59b8 003b b1

        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
        at java.lang.Class.privateGetMethodRecursive(Unknown Source)
        at java.lang.Class.getMethod0(Unknown Source)
        at java.lang.Class.getMethod(Unknown Source)
        at sun.launcher.LauncherHelper.validateMainClass(Unknown Source)
        at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)

This does not seem to work either. Are there any alternatives for monitoring object creation?


Solution

  • The part of the message

    Location:
      AbstractDemo.<init>()V @4: dup
    

    hints at it: you are instrumenting a constructor. Within a constructor, invokespecial <init> is also used to delegate to another constructor, either in the same class or in the superclass.

    The typical sequence for calling another constructor is aload_0 (this), push arguments, invokespecial <init>, so there is no reference to the object on the stack after the invocation.

    This is what the decoded bytecode from the VerifyError looks like:

      0 aload_0
      1 invokespecial   [1]
      4 dup
      5 invokestatic    [59]
      8 return
    

    Normally, you don’t want to report these delegating constructor calls, as that would report the same object multiple times. But identifying them can be tricky, as the receiver class is not a reliable criterion. E.g., the following is valid Java code:

    public class Example {
        Example reference;
        Example(Example anotherObject) {
            reference = anotherObject;
        }
        Example() {
            this(new Example(null));
            reference.reference = new Example(this);
        }
    }
    

    Here, we have a constructor containing three invokespecial instructions with the same target class, and the delegating constructor call is neither the first nor the last one, so no simple-to-check property of the instruction itself identifies it. To know whether an instruction initializes the current instance, you have to determine that the instruction providing its target is an aload of index zero, i.e. this, which is nontrivial when argument-providing instructions sit in between.

    That said, even outside a constructor there is no guarantee that the newly instantiated object is on the stack. It usually is when the instantiation is used in an expression context, where the result is subsequently stored or used, but not in a statement context. In other words, for a method like

    void test() {
        new Example();
    }
    

    naive compiler implementations (like javac) may generate the equivalent of the expression code, followed by a pop instruction, but other implementations (like ecj) could elide the preceding dup in this case, eliminating the need for the subsequent pop, so no reference will be on the stack after the invokespecial <init> instruction.

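    A safer approach is to search for instruction sequences starting with new and leading to an invokespecial <init> (allowing nested occurrences). Then, inject a dup right after the new instruction and the invokestatic after the matching invokespecial. The injected dup keeps one extra reference on the stack across the constructor call, so the method's maximum stack size has to grow accordingly. An invokespecial <init> that has no preceding unmatched new is a delegating this(...) or super(...) call and is left untouched.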
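
The pairing of new with its invokespecial <init> can be sketched without ASM at all. The following is only a toy model under stated assumptions: instructions are plain strings rather than visitor calls, the class name NewInitPairing and the reporting target Recorder.object_new are hypothetical, and the code assumes that new/<init> pairs nest properly, as they do in the output of typical Java compilers. A counter of unmatched new instructions is then enough to tell a real instantiation from a delegating constructor call. The input is the no-arg constructor of the Example class above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the rewrite: instructions are strings, not real bytecode. */
class NewInitPairing {
    // Hypothetical reporting call; stands in for the injected invokestatic.
    static final String REPORT = "invokestatic Recorder.object_new";

    static List<String> instrument(List<String> code) {
        List<String> out = new ArrayList<>();
        int pendingNew = 0; // `new` instructions whose <init> call has not been seen yet
        for (String insn : code) {
            out.add(insn);
            if (insn.startsWith("new ")) {
                out.add("dup"); // extra reference that survives the constructor call
                pendingNew++;
            } else if (insn.startsWith("invokespecial ") && insn.endsWith(".<init>")) {
                if (pendingNew > 0) {
                    pendingNew--;    // innermost pending `new` is now initialized
                    out.add(REPORT); // consumes the extra reference
                }
                // pendingNew == 0: this(...) or super(...) call, left untouched
            }
        }
        return out;
    }

    /** Simplified instruction stream of Example() from the answer above. */
    static List<String> exampleConstructor() {
        return Arrays.asList(
            "aload_0",
            "new Example", "dup", "aconst_null", "invokespecial Example.<init>",
            "invokespecial Example.<init>", // the delegating this(...) call
            "aload_0", "getfield Example.reference",
            "new Example", "dup", "aload_0", "invokespecial Example.<init>",
            "putfield Example.reference",
            "return");
    }

    public static void main(String[] args) {
        for (String insn : instrument(exampleConstructor())) {
            System.out.println(insn);
        }
    }
}
```

In an ASM MethodVisitor, the same counter would presumably live in a field, be incremented in visitTypeInsn for the NEW opcode and checked in visitMethodInsn for INVOKESPECIAL with the name <init>. Because the rewrite changes the stack depth between new and the constructor call, writing the class with ClassWriter.COMPUTE_MAXS, or COMPUTE_FRAMES when branches occur in between, is likely needed.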