Search code examples
javajava-bytecode-asm

Method visitor is not working in Java ASM visitLineNumber()


I want to add a method call to each line of a specific class. To do this i want to use the ASM (visitor based) library.

The not working part means the code (Method call) is not inserted.

My (not working) code in the MethodVisitor class looks like this so far:

@Override
public void visitLineNumber(int line, Label start) {
  mv.visitMethodInsn(
      Opcodes.INVOKESTATIC,
      classpath,
      "visitLine",
      "()V",
      false);
  super.visitLineNumber(line, start);

I tried another method of the MethodVisitor and it worked just fine like this:

@Override
public void visitInsn(int opcode) {
  mv.visitMethodInsn(
          Opcodes.INVOKESTATIC,
          classpath,
          "visitLine",
          "()V",
          false);
  super.visitInsn(opcode);
}

My question is: Why does the first thing not work but the second?

edit: More context:

I want to insert the method call visitLine() in every line of code. A possible example class is this one:

public class Calculator {
  public int evaluate(final String pExpression) {
    int sum = 0;
    for (String summand : pExpression.split("\\+")) {
      sum += Integer.parseInt(summand);
    }
    return sum;
  }
}

becomes:

public class Calculator {
  public int evaluate(final String pExpression) {
    OutputWriter.visitLine();
    int sum = 0;
    OutputWriter.visitLine();
    for (String summand : pExpression.split("\\+")) {
      OutputWriter.visitLine();
      sum += Integer.parseInt(summand);
    }
    OutputWriter.visitLine();
    return sum;
  }
}

I have a basic setup of ClassReader, ClassWriter and ClassVisitor like this:

ClassWriter cw = new ClassWriter(0);
ClassReader cr = new ClassReader(pClassName);
ClassVisitor tcv = new TransformClassVisitor(cw);
cr.accept(tcv, 0);
return cw.toByteArray();

In the MethodVisitor i only override this method:

@Override
  public void visitLineNumber(int line, Label start) {
    System.out.println(line);
    mv.visitMethodInsn(
            Opcodes.INVOKESTATIC,
            classpath,
            "visitLine",
            "()V",
            false);
    super.visitLineNumber(line, start);
  }

This prints out all the lines of the visited classes but the method call i wanted to add was not added or at least is not executed.

edit: Found out some new stuff:

The visitLineNumber insertion works if it doesn't insert something in the last line of a method.

For example the calculator class from above: As long as there is no code inserted in line 7 (the return line) the code works fine. I tried another class with 2 return statements and it also worked fine until it reached the last return statement.

I think there is an error with the order the method call is inserted. Maybe it is inserted after the return statement which leads to an error when validating the class file.

Any new ideas on the topic?


Solution

  • There are two issues here.

    First, it seems that when Instrumentation.retransformClasses is invoked, errors with the transformed code, like VerifyError, are not reported by the JVM, but instead, it will simply proceed with the old code.

    I don’t see any way to change the JVM’s behavior here. It’s worth creating an additional test environment, where you use a different method to activate the code, like load-time transformation or just transforming the compiled classes statically and try to load these. This may be in addition to the production code which uses the same transformation code with retransformClasses, once these tests show no errors.

    By the way, when you implement a ClassFileTransformer, you should pass the byte[] array you received as parameter for the transform method to the ClassReader(byte[]) constructor instead of using the ClassReader(String) constructor.


    Second, the code location of the last reported line number is also a branch target. Keep in mind that line breaks do not generate code, so the end of the loop is identical with the start of the return statement.

    ASM will report the associated artifacts in the following order:

    • visitLabel with a Label instance associated with the code location
    • visitLineNumber with the new line number and the Label from the previous step
    • visitFrame to report a stack map frame associated with this code location (as it is a branch target)

    You are inserting a new instruction at the visitLineNumber call, which causes the branch target to be before this new instruction, as you delegated the visitLabel before it. But the visitFrame call gets delegated after the new instructions have been inserted, hence, is not associated with the branch target anymore. This causes a VerifyError, as it is mandatory to have a stackmap frame for each branch target.

    A simple, but expensive solution would be, not to use the original class’ stackmap frames, but let ASM recompute them. I.e.

    public static byte[] getTransformed(byte[] originalCode) {
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
        ClassReader cr = new ClassReader(originalCode);
        ClassVisitor tcv = new TransformClassVisitor(cw);
        cr.accept(tcv, ClassReader.SKIP_FRAMES);
        return cw.toByteArray();
    }
    

    By the way, when you keep most of the original code but only insert some new statements, it’s beneficial to optimize the process by passing the ClassReader to the ClassWriter’s constructor:

    public static byte[] getTransformed(byte[] originalCode) {
        ClassReader cr = new ClassReader(originalCode);
        ClassWriter cw = new ClassWriter(cr, ClassWriter.COMPUTE_FRAMES);
        ClassVisitor tcv = new TransformClassVisitor(cw);
        cr.accept(tcv, ClassReader.SKIP_FRAMES);
        return cw.toByteArray();
    }
    

    A more efficient solution which does not recalculate the stackmap frames (as the original frames are still suitable for such a simple transformation), is not that easy with ASM’s API. The only idea I have so far, would be to postpone the insertion of the new instruction until the frame has been visited, if there is one. Unfortunately, that implies overriding all visit methods for instructions:

    Stay with

    public static byte[] getTransformed(byte[] originalCode) {
        ClassReader cr = new ClassReader(originalCode);
        ClassWriter cw = new ClassWriter(cr, 0);
        ClassVisitor tcv = new TransformClassVisitor(cw);
        cr.accept(tcv, 0);
        return cw.toByteArray();
    }
    

    and use

    static class Transformator extends MethodVisitor {
        int lastLineNumber;
    
        public Transformator(MethodVisitor mv) {
            super(Opcodes.ASM5, mv);
        }
        public void visitLineNumber(int line, Label start) {
            lastLineNumber = line;
            super.visitLineNumber(line, start);
        }
        private void checkLineNumber() {
            if(lastLineNumber > 0) {
                mv.visitMethodInsn(Opcodes.INVOKESTATIC, classpath,"visitLine","()V", false);
                lastLineNumber = 0;
            }
        }
        public void visitTryCatchBlock(Label start, Label end, Label handler, String type) {
            checkLineNumber();
            super.visitTryCatchBlock(start, end, handler, type);
        }
        public void visitMultiANewArrayInsn(String descriptor, int numDimensions) {
            checkLineNumber();
            super.visitMultiANewArrayInsn(descriptor, numDimensions);
        }
        public void visitLookupSwitchInsn(Label dflt, int[] keys, Label[] labels) {
            checkLineNumber();
            super.visitLookupSwitchInsn(dflt, keys, labels);
        }
        public void visitTableSwitchInsn(int min, int max, Label dflt, Label... labels) {
            checkLineNumber();
            super.visitTableSwitchInsn(min, max, dflt, labels);
        }
        public void visitIincInsn(int var, int increment) {
            checkLineNumber();
            super.visitIincInsn(var, increment);
        }
        public void visitLdcInsn(Object value) {
            checkLineNumber();
            super.visitLdcInsn(value);
        }
        public void visitJumpInsn(int opcode, Label label) {
            checkLineNumber();
            super.visitJumpInsn(opcode, label);
        }
        public void visitInvokeDynamicInsn(
            String name, String desc, Handle bsmHandle, Object... bsmArg) {
            checkLineNumber();
            super.visitInvokeDynamicInsn(name, desc, bsmHandle, bsmArg);
        }
        public void visitMethodInsn(
            int opcode, String owner, String name, String desc, boolean iface) {
            checkLineNumber();
            super.visitMethodInsn(opcode, owner, name, desc, iface);
        }
        public void visitFieldInsn(int opcode, String owner, String name,String descriptor) {
            checkLineNumber();
            super.visitFieldInsn(opcode, owner, name, descriptor);
        }
        public void visitTypeInsn(int opcode, String type) {
            checkLineNumber();
            super.visitTypeInsn(opcode, type);
        }
        public void visitVarInsn(int opcode, int var) {
            checkLineNumber();
            super.visitVarInsn(opcode, var);
        }
        public void visitIntInsn(int opcode, int operand) {
            checkLineNumber();
            super.visitIntInsn(opcode, operand);
        }
        public void visitInsn(int opcode) {
            checkLineNumber();
            super.visitInsn(opcode);
        }
    }
    

    Unfortunately, ASM’s visitor model has no preVisitInstr() or such alike.

    Note that with this design, it is also impossible to erroneously inject instructions after the method’s last instruction as the injected instruction(s) are always placed before another instruction.