I'm working on the byte code compiler for Renjin (R for the JVM) and am experimenting with translating our intermediate three address code (TAC) representation to byte code. All the textbooks on compilers that I've consulted discuss register allocation during code generation, but I haven't been able to find any resources for code generation on stack-based virtual machines like the JVM.
Simple TAC instructions are trivial to translate into bytecode, but I get a bit lost when temporaries are involved. Does any one have any pointers to resources that describe this?
Here is a complete example:
Original R code looks like this:
x + sqrt(x * y)
TAC IR:
0: _t2 := primitive<*>(x, y)
1: _t3 := primitive<sqrt>(_t2)
2: return primitive<+>(x, _t3)
(ignore for a second the fact taht we can't always resolve function calls to primitives at compile time)
The resulting JVM byte code would look (roughly) something like this:
aload_x
dup
aload_y
invokestatic r/primitives/Ops.multiply(Lr/lang/Vector;Lr/lang/Vector;)
invokestatic r/primitives/Ops.sqrt(Lr/lang/Vector;)
invokestatic r/primitives/Ops.plus(Lr/lang/Vector;Lr/lang/Vector;)
areturn
Basically, at the top of the program, I already need to be thinking that I'm going to need local variable x at the beginning of the stack by the time that i get to TAC instruction 2. I can think this through manually but I'm having trouble thinking through an algorithm to do this correctly. Any pointers?
Transforming a 3-address representation into stack is easier than a stack one into 3-address.
Your sequence should be the following: