Search code examples
compiler-constructiongarbage-collectioncode-generationstack-machinegc-roots

How to find gc roots in a stack machine?


I am writing a compiler for a fairly standard stack machine. Now I want to add a garbage collector. I can see that I could generate some sort of 'stack maps' to know which variables are gc roots in each activation record. However, I have no idea how to deal with intermediate values pushed in the stack during execution. The language I am compiling is Pascal-like so I don't need and I don't want to use tags to identify pointers from other data types.

I would appreciate any hints/pointers on how to

  1. Find gc roots in the stack at any point in time (i.e., how to identify which of the intermediate values which have been pushed in the stack are gc roots).
  2. Usual forms of encoding this information (i.e., how to generate and encode 'stack maps')

Thank you very much! Nicolas


Solution

  • A simple solution is to explicitly store the type of each stack entry. Then you don't need a stack map; if the type is "reference" then the entry is a GC root. This approach is particularly handy for debugging, because you can easily display the (typed) contents of the stack.

    If you really want to use stack maps, a simple solution is to generate a stack map to go with every instruction. You do this by keeping track of the stack contents while compiling, or by taking a second pass over the compiled instructions. Then when looking for GC roots, for each frame on the stack, you use the map that goes with the current instruction.