
How to generate looping bytecode using Javassist?


I am trying to write a compiler for an esoteric programming language that compiles to Java bytecode, and I'm using Javassist to generate the bytecode.

I got stuck when trying to generate branching/looping code. For example, let's say I'm generating the code for:

while (true) System.out.println("Hello World!");

Here is my attempt:

var mainClass = new ClassFile(false, "Main", null);
var constPool = mainClass.getConstPool();
var mainMethodCode = new Bytecode(constPool);

int label = mainMethodCode.currentPc();
mainMethodCode.addGetstatic(ClassPool.getDefault().get("java.lang.System"), "out", "Ljava/io/PrintStream;");
mainMethodCode.addLdc("Hello World!");
mainMethodCode.addInvokevirtual("java.io.PrintStream", "println", "(Ljava/lang/String;)V");
mainMethodCode.addOpcode(Opcode.GOTO);
// I know that branch instructions take a PC-relative offset
// and after some trial and error, this seems to be the correct formula
var offset = label - mainMethodCode.currentPc() + 1;
mainMethodCode.addIndex(offset);

mainMethodCode.setMaxLocals(1);
var mainMethodInfo = new MethodInfo(constPool, "main", "([Ljava/lang/String;)V");
mainMethodInfo.setCodeAttribute(mainMethodCode.toCodeAttribute());
mainClass.addMethod(mainMethodInfo);
mainClass.setAccessFlags(AccessFlag.PUBLIC);
mainMethodInfo.setAccessFlags(AccessFlag.PUBLIC | AccessFlag.STATIC);
ClassPool.getDefault().makeClass(mainClass).writeFile(...);

By inspecting the class file, I can see that the expected bytecode is generated:

Code:
  stack=2, locals=1, args_size=1
     0: getstatic     #12                 // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #14                 // String Hello World!
     5: invokevirtual #20                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: goto          0

However, when running the class file using java Main, I get a VerifyError. I apparently need to add the goto target into a stack map (whatever that means).

Expecting a stackmap frame at branch target 0

I've found a StackMap.Writer class in Javassist, so I tried

var stackMap = new StackMap.Writer();
stackMap.write16bit(label); // does this add 0 (the value of label) to the stack map? 
...
var codeAttr = mainMethodCode.toCodeAttribute();
codeAttr.setAttribute(stackMap.toStackMap(constPool));
mainMethodInfo.setCodeAttribute(codeAttr);
...

However, the same VerifyError occurs when I try to run the class.

What is the intended way of generating branching code in Javassist?


Solution

  • Thanks to Holger's comment, I figured out that I needed a StackMapTable, not a StackMap, and that there must indeed be a stack map table entry at every branch destination.

    var stackMap = new StackMapTable.Writer(0);
    stackMap.sameFrame(0);
    // ...
    var codeAttr = mainMethodCode.toCodeAttribute();
    codeAttr.setAttribute(stackMap.toStackMapTable(constPool));
    

    Note that there are different types of frames, and each says something different about the operand stack and about which local variables are available. sameFrame is the type that indicates that the local variables are the same as in the previous frame and that the operand stack is empty. Other types include appendFrame, chopFrame, and fullFrame. For more info, see JVMS 4.7.4.
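
    As a rough sketch of how these frame types are told apart in the class file: every stack map table entry starts with a one-byte frame_type, and JVMS 4.7.4 assigns a range of values to each kind of frame. The class and method names below are made up for illustration and are not part of Javassist.

```java
class FrameTypes {
    // Maps the frame_type byte of a StackMapTable entry to the kind of frame,
    // using the value ranges listed in JVMS 4.7.4.
    static String kindOf(int frameType) {
        if (frameType < 0 || frameType > 255) {
            throw new IllegalArgumentException("frame_type is an unsigned byte");
        }
        if (frameType <= 63)  return "same_frame";                      // offset delta is the type itself
        if (frameType <= 127) return "same_locals_1_stack_item_frame";  // offset delta is type - 64
        if (frameType <= 246) return "reserved";
        if (frameType == 247) return "same_locals_1_stack_item_frame_extended";
        if (frameType <= 250) return "chop_frame";
        if (frameType == 251) return "same_frame_extended";
        if (frameType <= 254) return "append_frame";
        return "full_frame";
    }

    public static void main(String[] args) {
        for (int type : new int[] {0, 64, 247, 248, 251, 252, 255}) {
            System.out.println(type + " -> " + kindOf(type));
        }
    }
}
```

    For the loop above, the single entry is a same_frame whose frame_type byte is 0, because its offset delta is 0.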

    In general, the argument 0 passed to sameFrame is not the bytecode offset to which the stack map table entry applies. Rather, it is the "offset delta": the bytecode offset that an entry applies to is computed by adding (offset delta + 1) to the bytecode offset that the previous frame applies to. Only for the first frame is the offset delta the same as the bytecode offset to which it applies.
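
    To make the offset delta rule concrete, here is a minimal sketch that turns a list of absolute bytecode offsets (one per frame, in ascending order) into the offset deltas that would be passed to the frame-writing methods. The class name, method name, and the offsets 0, 8 and 20 are hypothetical and only serve as an example.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetDeltas {
    // Converts strictly increasing absolute frame offsets into offset_delta values.
    static List<Integer> toDeltas(int... frameOffsets) {
        List<Integer> deltas = new ArrayList<>();
        int previous = -1; // chosen so that the first delta equals the first offset
        for (int offset : frameOffsets) {
            if (offset <= previous) {
                throw new IllegalArgumentException("offsets must be strictly increasing");
            }
            // Inverse of: offset = previous + offset_delta + 1
            deltas.add(offset - previous - 1);
            previous = offset;
        }
        return deltas;
    }

    public static void main(String[] args) {
        System.out.println(toDeltas(0, 8, 20)); // prints [0, 7, 11]
    }
}
```

    Reading the result back: the first frame applies at offset 0, the second at 0 + 7 + 1 = 8, and the third at 8 + 11 + 1 = 20.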

    It seems like the ASM library is more suitable for bytecode generation like this: I don't need to calculate the PC offsets manually, and it even has an option (COMPUTE_FRAMES) to work out which type of frame should be used, at the cost of performance.

    var cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
    
    cw.visit(V1_6,
        ACC_PUBLIC + ACC_SUPER,
        "Main",
        null,
        "java/lang/Object",
        null);
    
    cw.visitSource("Main.java", null);
    var mv = cw.visitMethod(ACC_PUBLIC + ACC_STATIC,
        "main",
        "([Ljava/lang/String;)V",
        null,
        null);
    mv.visitCode();
    var start = new Label();
    mv.visitLabel(start);
    mv.visitFieldInsn(GETSTATIC,
        "java/lang/System",
        "out",
        "Ljava/io/PrintStream;");
    mv.visitLdcInsn("Hello World!");
    mv.visitMethodInsn(INVOKEVIRTUAL,
        "java/io/PrintStream",
        "println",
        "(Ljava/lang/String;)V",
        false);
    mv.visitJumpInsn(GOTO, start);
    // must still be called, but the arguments are ignored with COMPUTE_FRAMES
    mv.visitMaxs(0, 0);
    mv.visitEnd();
    cw.visitEnd();
    var bytes = cw.toByteArray();
    // try-with-resources closes the stream even if write() throws
    try (var stream = new FileOutputStream("...")) {
        stream.write(bytes);
    }
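
    For comparison, the manual PC offset calculation that ASM's labels take care of can be sketched without any library. A goto is encoded as the opcode byte 0xA7 followed by a signed 16-bit offset relative to the address of the opcode itself. In the Javassist attempt, currentPc() is already one past the opcode when the offset is computed, which would explain the + 1 in that formula. The class and method names here are hypothetical.

```java
class BranchOffset {
    static final int GOTO = 0xA7; // opcode value of goto

    // Returns the three bytes of a goto located at opcodePc that jumps to targetPc.
    static byte[] encodeGoto(int opcodePc, int targetPc) {
        int offset = targetPc - opcodePc; // relative to the goto opcode itself
        if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
            throw new IllegalArgumentException("offset does not fit in 16 bits");
        }
        // Big-endian: high byte of the offset first, then the low byte.
        return new byte[] {(byte) GOTO, (byte) (offset >> 8), (byte) offset};
    }

    public static void main(String[] args) {
        byte[] insn = encodeGoto(8, 0); // the goto at pc 8 jumping back to pc 0
        System.out.printf("%02x %02x %02x%n",
                insn[0] & 0xFF, insn[1] & 0xFF, insn[2] & 0xFF); // prints a7 ff f8
    }
}
```

    The bytes ff f8 are the two's complement encoding of -8, matching the goto at offset 8 that jumps back to offset 0 in the disassembly.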