Stray instructions when reconstructing method using asm


I am using asm to modify the instructions of a MethodNode. My code constructs a graph from methodNode.instructions. Using this graph, I rearrange and delete instructions, and then use it to generate a new list of instructions for the MethodNode. The problem is that the reused instructions still have values for .getNext(), so stray instructions that are no longer contained in the graph get appended to the end of the method's instructions. This also causes an ArrayIndexOutOfBoundsException when converting the InsnList to an array.

Example:

Initial instructions:

L0
 ALOAD 1
 LDC 1784196469
 INVOKEVIRTUAL dq.c (I)I
 ISTORE 2
 ILOAD 2
 IFNE L1
 GOTO L2
L3
 RETURN
L1
 ALOAD 0
 ALOAD 1
 ILOAD 2
 BIPUSH 15
 INVOKEVIRTUAL ac.f (Ldq;IB)V
 GOTO L0
L2
 GOTO L3

After the changes in the graph, these are the instructions that are added to methodNode.instructions once the list has been cleared:

L0
 ALOAD 1
 LDC 1784196469
 INVOKEVIRTUAL dq.c (I)I
 ISTORE 2
 ILOAD 2
 IFNE L1
 RETURN
L1
 ALOAD 0
 ALOAD 1
 ILOAD 2
 BIPUSH 15
 INVOKEVIRTUAL ac.f (Ldq;IB)V
 GOTO L0

When .getNext() is called on the last instruction, this is the result:

org.objectweb.asm.tree.LabelNode@7a0ac6e3

As you can see, the last instruction added (GOTO L0) still has a value for .getNext().

When this MethodNode is saved and then decompiled, this is the result. Note the stray GOTO statement to a nonexistent label.

L1 {
    aload1
    ldc 1784196469 (java.lang.Integer)
    invokevirtual dq c((I)I);
    istore2
    iload2
    ifne L2
    return
}
L2 {
     aload0 // reference to self
     aload1
     iload2
     bipush 15
     invokevirtual ac f((Ldq;IB)V);
     goto L1
     goto L3
}

How can I reuse instructions from the method if I am changing the order of them? Is this a bug or am I using asm wrong?


Solution

  • InsnList is a linked list of AbstractInsnNode. Each AbstractInsnNode instance has a pointer called next that points to the next node in the list.

    When you rearrange nodes in a list, you need to take care of these pointers as well. In your case, the node for GOTO L0 originally points to L2, and because its next pointer was not updated during the rearrangement, it still points to L2 at the end.

    There seems to be no easy way to update the next pointer. There is no setter method for it. The clear method doesn't reset it. The removeAll method is not visible.

    Some possibilities might be to (i) use remove to remove all nodes from the list before reusing them; or (ii) construct new nodes for the new list.
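To make the stale-pointer effect and option (i) concrete, here is a minimal sketch in plain Java. It does not use asm itself: Node and NodeList are hypothetical stand-ins for AbstractInsnNode and InsnList, written on the assumption that add appends without resetting the node's own next pointer, clear only forgets the nodes, and remove unlinks a node and nulls its pointers. The instruction texts are a shortened version of the listing in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleNextDemo {
    /** Hypothetical stand-in for AbstractInsnNode: text plus prev/next links. */
    static final class Node {
        final String text;
        Node prev, next;
        Node(String text) { this.text = text; }
    }

    /** Hypothetical stand-in for InsnList (assumed behavior, not asm's code). */
    static final class NodeList {
        Node first, last;

        // Appends the node but never resets node.next, so a reused node
        // keeps pointing at its old successor.
        void add(Node node) {
            if (last == null) {
                first = node;
            } else {
                last.next = node;
                node.prev = last;
            }
            last = node;
        }

        // Forgets the nodes; their prev/next pointers stay as they were.
        void clear() {
            first = null;
            last = null;
        }

        // Unlinks one node and nulls its pointers, making it safe to reuse.
        void remove(Node node) {
            if (node.prev == null) first = node.next; else node.prev.next = node.next;
            if (node.next == null) last = node.prev; else node.next.prev = node.prev;
            node.prev = null;
            node.next = null;
        }

        // Follows next pointers from the head, as a getNext() loop would.
        List<String> walk() {
            List<String> out = new ArrayList<>();
            for (Node n = first; n != null; n = n.next) out.add(n.text);
            return out;
        }
    }

    /** Rebuilds the list keeping only IFNE L1, RETURN and GOTO L0. */
    public static List<String> rebuild(boolean detachFirst) {
        String[] texts = {"IFNE L1", "GOTO L2", "RETURN", "GOTO L0", "L2", "GOTO L3"};
        NodeList list = new NodeList();
        Node[] nodes = new Node[texts.length];
        for (int i = 0; i < texts.length; i++) {
            nodes[i] = new Node(texts[i]);
            list.add(nodes[i]);
        }
        if (detachFirst) {
            for (Node n : nodes) list.remove(n); // option (i): remove every node
        } else {
            list.clear();                        // what the question does
        }
        list.add(nodes[0]);
        list.add(nodes[2]);
        list.add(nodes[3]);
        return list.walk();
    }

    public static void main(String[] args) {
        System.out.println(rebuild(false)); // [IFNE L1, RETURN, GOTO L0, L2, GOTO L3]
        System.out.println(rebuild(true));  // [IFNE L1, RETURN, GOTO L0]
    }
}
```

In the first run, GOTO L0 still points at L2, so walking the list picks up the stray L2 and GOTO L3, which matches the extra goto L3 in the decompiled output. With asm, the corresponding step would presumably be to call remove on each node of methodNode.instructions before adding the kept nodes back. Note that in this model, reordering nodes without detaching them can even create a cycle in the next chain, which is why the sketch only deletes.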