java bytecode java-bytecode-asm bytecode-manipulation

Any way to regenerate stackmap from byte code?

I have an old library (circa 2005) that performs byte code manipulation but does not touch the stackmap. Consequently, my JVM (Java 8) complains that the generated classes are invalid. The only way to circumvent the errors is to run the JVM with -noverify, but that is not a long-term solution for me.

Is there some way I can regenerate the stackmap after the classes have already been generated? I saw that the ClassWriter class has an option to regenerate the stackmap, but I'm not sure how to read in the bytes of an existing class and write out a new one. Is that feasible?

Solution

When you instrument old classes not having stackmaps and keep their old version number, there will be no problem, as they will be processed by the JVM the same way as before, not requiring stackmaps. Of course, this implies that you can’t inject newer bytecode features.
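Whether a class file falls into this category can be read from its header. The following is a minimal sketch, assuming the standard class file layout (4 bytes magic number, 2 bytes minor version, 2 bytes major version) and that class files with major version 51 (Java 7) or higher must carry valid stackmaps; the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;

class ClassVersionCheck {
    /** Returns the major version stored in the class file header. */
    static int majorVersion(byte[] classFile) {
        ByteBuffer header = ByteBuffer.wrap(classFile); // big-endian by default
        if (classFile.length < 8 || header.getInt(0) != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        // bytes 4-5 hold the minor version, bytes 6-7 the major version
        return header.getShort(6) & 0xFFFF;
    }

    /** Assumption: major version 51 (Java 7) and later require stackmaps. */
    static boolean requiresStackMaps(byte[] classFile) {
        return majorVersion(classFile) >= 51;
    }
}
```

A class compiled for Java 5 (major version 49) would report false here, so it could be left alone as long as the instrumentation keeps that version number.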

When you are instrumenting newer class files which had valid stackmaps before the transformation, you will not be running into those problems described by Antimony. So you can use ASM to regenerate stackmaps:

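// requires: import org.objectweb.asm.ClassReader;
//           import org.objectweb.asm.ClassWriter;
byte[] bytecode = … // result of your instrumentation
ClassReader cr = new ClassReader(bytecode);
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
// SKIP_FRAMES: don't parse the old frames, they get recomputed anyway
cr.accept(cw, ClassReader.SKIP_FRAMES);
bytecode = cw.toByteArray(); // with recalculated stack maps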

The visitor API has been designed to allow easy chaining of a reader with a writer and only add code to intercept those artifacts you want to change.

Note that since we know that we are going to regenerate the stackmap frames from scratch using ClassWriter.COMPUTE_FRAMES, we can pass ClassReader.SKIP_FRAMES to the reader to tell it not to process the source frames we’re going to ignore anyway.

There is another optimization possible when we know that the class structure doesn’t change. We can pass the ClassReader to the ClassWriter’s constructor to draw a benefit from the unchanged structure, e.g. the target constant pool will get initialized with a copy of the source constant pool. This option, however, must be handled with care. If we don’t intercept methods at all, they will be subject to the same optimization, i.e. their code gets copied entirely without even recalculating the stack frames. So we need a custom method visitor to pretend that the code could potentially change:

// requires: import org.objectweb.asm.ClassReader;
//           import org.objectweb.asm.ClassVisitor;
//           import org.objectweb.asm.ClassWriter;
//           import org.objectweb.asm.MethodVisitor;
//           import org.objectweb.asm.Opcodes;
byte[] bytecode = … // result of your instrumentation
ClassReader cr = new ClassReader(bytecode);
// passing cr to ClassWriter to enable optimizations
ClassWriter cw = new ClassWriter(cr, ClassWriter.COMPUTE_FRAMES);
cr.accept(new ClassVisitor(Opcodes.ASM5, cw) {
    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
                                     String signature, String[] exceptions) {
        MethodVisitor writer = super.visitMethod(access, name, desc, signature, exceptions);
        return new MethodVisitor(Opcodes.ASM5, writer) {
            // not changing anything, just preventing code specific optimizations
        };
    }
}, ClassReader.SKIP_FRAMES);
bytecode = cw.toByteArray(); // with recalculated stack maps

This way, unchanged artifacts like the constant pool can be copied directly to the target byte code while the stackmap frames still get recalculated.

There are some caveats, though. Generating stackmaps from scratch implies not utilizing any knowledge about the original code structure or the nature of the transformation. E.g. a compiler would know the formal types of local variable declarations, whereas the ClassWriter may see different actual types for which it has to find the common base type. This search may be very expensive and may trigger the loading of classes whose loading would otherwise have been deferred, or which would not have been used at all during normal execution. The resulting type may even differ from the common type declared in the original code. It will be a correct type, but it may again change the use of classes in the resulting code.
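To illustrate how the inferred type can differ from the declared one, here is a small sketch of such a search working on already loaded Class objects. It is an illustration of the idea under the assumption that only the superclass chain is walked, not a copy of ASM's code; the class name is hypothetical:

```java
class CommonSuperClassDemo {
    /** Finds a common super type of two loaded classes by walking up the superclass chain. */
    static Class<?> commonSuperClass(Class<?> a, Class<?> b) {
        if (a.isAssignableFrom(b)) return a;
        if (b.isAssignableFrom(a)) return b;
        // assumption: unrelated interfaces are simply merged to Object
        if (a.isInterface() || b.isInterface()) return Object.class;
        do {
            a = a.getSuperclass();
        } while (!a.isAssignableFrom(b));
        return a;
    }

    public static void main(String[] args) {
        // Source code could have declared the variable as CharSequence,
        // but walking the superclass chain only finds Object for these two.
        System.out.println(commonSuperClass(String.class, StringBuilder.class)); // class java.lang.Object
        System.out.println(commonSuperClass(Integer.class, Long.class));         // class java.lang.Number
    }
}
```

Note that the search needs both classes to be available as Class objects, which is exactly where the class loading side effects mentioned above come from.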

If you are performing the instrumentation in a different environment, ASM’s attempts to load the classes for determining the common type may fail. Then, you will have to override ClassWriter.getCommonSuperClass(…) with an implementation which can perform the operation in that environment. This is also the place to add optimizations, if you have more knowledge about the code and can provide answers without expensive searches through the type hierarchy.
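As a sketch of what such an implementation could delegate to, the following helper answers the question from a pre-built table instead of loading classes. The class OfflineHierarchy and its map of internal names to superclass names are hypothetical; how that map gets populated (e.g. by scanning the class files being instrumented) is left open, and the sketch assumes the map is acyclic and ignores interface relationships:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class OfflineHierarchy {
    // internal class name (e.g. "java/lang/String") -> internal name of its superclass
    private final Map<String, String> superClassOf;

    OfflineHierarchy(Map<String, String> superClassOf) {
        this.superClassOf = superClassOf;
    }

    /** Returns the nearest common ancestor known to the map, else java/lang/Object. */
    String commonSuperClass(String type1, String type2) {
        // collect type1 and all of its known ancestors
        Set<String> ancestors = new HashSet<>();
        for (String t = type1; t != null; t = superClassOf.get(t)) {
            ancestors.add(t);
        }
        // walk up from type2 until hitting one of them
        for (String t = type2; t != null; t = superClassOf.get(t)) {
            if (ancestors.contains(t)) {
                return t;
            }
        }
        return "java/lang/Object"; // fallback for unknown types
    }
}
```

A ClassWriter subclass would then override the protected method getCommonSuperClass(String type1, String type2) and return the result of commonSuperClass(type1, type2), so no class loading is attempted during frame computation.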

Generally, it’s recommended to refactor that old library to use ASM in the first place instead of needing a subsequent adaptation step. As explained above, when performing the code transformation using a chain of ClassReader and ClassWriter with optimizations enabled, ASM would be able to copy all unchanged methods, including their stackmaps, and only recalculate the stackmaps of actually changed methods. In the code above, which does the recalculation in a subsequent step, we had to disable the optimization as we no longer know which methods were actually changed.

The next logical step would be to incorporate stackmap handling into the instrumentation itself, as more often than not the knowledge about the actual transformation allows you to keep 99% of the existing frames and easily adapt the others, instead of needing an expensive recalculation from scratch.