Search code examples
javabinary-treebytecode-manipulationbyte-buddy

Dynamically generate a single function (without subfunctions), representing a binary expression tree, at run time with Byte Buddy


Introduction

I want to compare some libraries for generating code at run time. At the moment I touched the surface of Javassist and Byte Buddy.

As a proof of concept I am trying to solve a small problem, which is a starting point for a more complex one.
Basically I have a binary expression tree which I want to convert into a single line of code and load it into my java run time. For simplicity reasons I have only add nodes and constants as leafs so far.

Javassist Reference

I already have a way for doing this in Javassist (which at least works for a single node with two leafs). The code is looking like this:

public class JavassistNodeFactory{
    public DynamicNode generateDynamicNode(INode root){
        DynamicNode dynamicNode = null;
        try {
            CtClass cc = createClass();
            interceptMethod(root, cc);
            compileClass(cc);
            dynamicNode = instantiate(cc);
        }catch (Exception e){
            System.out.println("Error compiling class with javassist: "+ e.getMessage());
            e.printStackTrace();
        }
        return dynamicNode;
    }

    private DynamicNode instantiate(CtClass cc) throws CannotCompileException, IllegalAccessException, InstantiationException {
        Class<?> clazz = cc.toClass();
        return (DynamicNode) clazz.newInstance();
    }

    private void compileClass(CtClass cc) throws NotFoundException, IOException, CannotCompileException {
        cc.writeFile();
    }

    private void interceptMethod(INode root, CtClass cc) throws NotFoundException, CannotCompileException {
        CtMethod calculateMethod = cc.getSuperclass().getDeclaredMethod("calculateValue",null);
        calculateMethod.setBody("return "+ nodeToString(root)+ ";");
    }

    private CtClass createClass() throws CannotCompileException, NotFoundException {
        ClassPool pool = ClassPool.getDefault();
        CtClass cc = pool.makeClass(
                "DN"+ UUID.randomUUID().toString().replace("-","")
        );
        cc.setSuperclass(pool.get("org.jamesii.mlrules.util.runtimeCompiling.DynamicNode"));
        return cc;
    }

    private static String nodeToString(INode node){
        if (node.getName().equals("")){
            return ((ValueNode)node).getValue().toString();
        }else{
            List<? extends INode> children = node.getChildren();
            assert(children.size()==2);
            return ("("+nodeToString(children.get(0))+node.getName()+nodeToString(children.get(1))+")");
        }
    }
}

The DynamicNode class looks like this:

public class DynamicNode implements INode   {
    @Override
    public <N extends INode> N calc() {
        Double value = calculateValue();
        return (N) new ValueNode<Double>(value);
    }

    @Override
    public List<? extends INode> getChildren() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    private Double calculateValue() {
        return null;
    }
}

The important part is the nodeToString() function, where I generate an arithmetic formula represented by the returned string, from a given root node. TheValueNode is a leaf of the tree with a constant Value, which would be returned as a String.
Other nodes (only add nodes for my case) will call the function recursively for each child and print brackets arround the expression as well as printing the operator (returned by the getName() function) in the middle of the two children (in short: "(leftChild+rightChild)").
The body of the calculateValue() function will be altered in the interceptMethod() function by Javassist, to return the result of the generated formula.

Byte Buddy Attempt

I have played around with Byte Buddy to achieve a similar solution. But as I looked deeper into the concepts and the documentation, I felt more and more like this is not a problem Byte Buddy was designed for. The majority of examples and questions seem to concentrate on the function delegation to other functions (which actually exist already at compile time, and are only connected to at run time). This is not really convenient in my case, since I have no way of knowing the actual tree I want to convert, at compile time. It is probably possible to use the underlying ASM library, but I would like to avoid handling byte code by myself (and possible successors of mine).

I have a (obviously not working) basic implementation, but I am stuck at the point where I have to provide an Implementation for the intercept() function of the Byte Buddy library. My last state looks like this:

public class ByteBuddyNodeFactory{
    @Override
    public DynamicNode generateDynamicNode(INode root) {
        DynamicNode dynamicNode = null;
        try {
            Class<?> dynamicType = new ByteBuddy()
                    .subclass(DynamicNode.class)
                    .name("DN"+ UUID.randomUUID().toString().replace("-",""))
    //this is the point where I have problems
    //I don't know how to generate the Implementation for the intercept() function
    //An attempt is in the nodeToImplementation() function
                    .method(ElementMatchers.named("calculateValue")).intercept(nodeToImplementation(root))
                    .make()
                    .load(Object.class.getClassLoader())
                    .getLoaded();
            dynamicNode = (DynamicNode) dynamicType.newInstance();
        } catch (Exception e) {
            System.out.println("Error compiling testclass with bytebuddy: " + e.getMessage());
            e.printStackTrace();
        }
        return dynamicNode;
    }

    private Implementation.Composable nodeToImplementation(INode node){
        if (node.getName().equals("")){
            return (Implementation.Composable)FixedValue.value(((ValueNode)node).getValue());
        }else{
            List<? extends INode> children = node.getChildren();
            assert(children.size()==2);
            switch (node.getName()){
                case ("+"):
                    //This is the point where I am completely lost
                    //This return is just the last thing I tried and may be not even correct Java code
                    // But hopefully at least my intention gets clearer
                    return (MethodCall.invoke((Method sdjk)-> {
                        return (nodeToImplementation(children.get(0)).andThen(node.getName().andThen(nodeToImplementation(children.get(1)))));
                    }));
                
                default:
                    throw new NotImplementedException();
            }
        }
    }
}

My idea was to concatenate subfunctions together and therefore tried to work with the Composable Implementation. I tried to return a MethodDelegation but as I mentioned I got the feeling that this wouldn't be the right approach. After that I tried MethodCall but I soon realized that I have exactly no idea how to make things work with this one either^^

Question

Is it possible in Byte Buddy to generate a function from a tree structure as dynamically as in Javassist, without calling as many sub functions as I have nodes?
How would I do this, if possible?
And if it is not possible: is it possible with other byte code manipulation libraries like cglib.

I would prefer to stay an abstraction level above byte code, since the study of the underlying concepts should be irrelevant for my problem.


Solution

  • What you are trying to do is not easily possible with Byte Buddy's high-level APIs. Instead, you should assemble a method using StackManipulations if you want to use Byte Buddy. Stack manipulations do still contain Java byte code but these bits should be so trivial that they would be easy to implement.

    The reason that Byte Buddy does not aim for this scenario is that you can normally find a better abstraction for your code than to assemble code snippets. Why can your nodes not implement the actual implementation which is then called from your instrumented method? The JIT compiler does typically optimize this code to the same result as your manually inlined code. Additionally, you preserve debuggability and reduce the complexity of your code.