I have been reading up on (stack-based) virtual machines lately. One thing I couldn't find a good answer to is the following:
At what level is a garbage collector usually implemented?
Thinking about this, I came up with the following two options:
- Implement the GC at language level.
- Implement the GC at VM level. This would mean having special instructions for requesting memory/objects. The lifetime of these objects would then be managed by the VM, based on the references to the object.
Are both of these valid options? And if so, which one is usually used in which cases?
Both options are valid; which one is used depends on the language and your goals.
For some languages, such as C, C++ (see Boehm GC) and Rust (see rust-gc), a GC is available as a library. For other languages, such as C# (see CoreCLR and Mono), Java and Ruby (see their repo), it is implemented in the runtime environment.
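To make the runtime-level case concrete, here is a minimal sketch of the second option from the question: a toy stack VM with a dedicated allocation instruction, where the VM decides when objects die based on which references are still reachable. Everything in it is an assumption made for illustration: the `VM` and `Obj` classes, the opcode names (`NEW`, `SETFIELD`, `DUP`, `POP`, `PUSH`) and the choice of a simple mark-and-sweep collector do not come from any real VM, and production runtimes use far more elaborate collectors.

```python
class Obj:
    """A heap object: a fixed number of slots holding ints or references to other Objs."""

    def __init__(self, size):
        self.fields = [None] * size
        self.marked = False


class VM:
    """Toy stack VM (hypothetical instruction set) that owns the heap and the GC."""

    def __init__(self):
        self.stack = []  # operand stack; these are the GC roots
        self.heap = []   # every object allocated through NEW

    def run(self, program):
        for op, *args in program:
            if op == "PUSH":        # push an integer constant
                self.stack.append(args[0])
            elif op == "NEW":       # allocate an object with n fields, push a reference
                obj = Obj(args[0])
                self.heap.append(obj)
                self.stack.append(obj)
            elif op == "SETFIELD":  # pop value, pop object, store value in field i
                value = self.stack.pop()
                obj = self.stack.pop()
                obj.fields[args[0]] = value
            elif op == "DUP":       # duplicate the top of the stack
                self.stack.append(self.stack[-1])
            elif op == "POP":       # discard the top of the stack
                self.stack.pop()
            else:
                raise ValueError(f"unknown opcode {op!r}")

    def collect(self):
        """Mark-and-sweep: keep what is reachable from the stack, drop the rest."""
        # Mark phase: walk the object graph starting from the roots.
        worklist = [v for v in self.stack if isinstance(v, Obj)]
        while worklist:
            obj = worklist.pop()
            if obj.marked:
                continue
            obj.marked = True
            worklist.extend(f for f in obj.fields if isinstance(f, Obj))
        # Sweep phase: unmarked objects are garbage.
        before = len(self.heap)
        self.heap = [o for o in self.heap if o.marked]
        for o in self.heap:
            o.marked = False  # reset for the next collection
        return before - len(self.heap)


program = [
    ("NEW", 1),       # allocate A           stack: [A]
    ("DUP",),         #                      stack: [A, A]
    ("NEW", 0),       # allocate B           stack: [A, A, B]
    ("SETFIELD", 0),  # A.fields[0] = B      stack: [A]
    ("NEW", 0),       # allocate C           stack: [A, C]
    ("POP",),         # drop the only reference to C
]

vm = VM()
vm.run(program)
freed = vm.collect()  # only C is unreachable, so one object is freed
```

The program being executed never frees anything itself. It only asks for objects via `NEW`, and the VM reclaims `C` once no root refers to it, while `B` survives because it is still reachable through `A`.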
I'm sure there are more examples and possibly counter-examples, too, but I believe at least a few observations can be made about what factors play a role in the decision:
- For a GC to be written at the language level, it must be somehow optional (even if it is on by default). After all, the GC needs to allocate memory for its own correct operation - thus, it has to be written in a language that's at least usable without a GC.
- The GC is the memory manager, so it's present and probably frequently used during the operation of almost every program in that language - thus, it can reasonably be considered a performance-critical piece of code. While there's no hard rule saying that a VM-based language implementation is necessarily less efficient than a natively compiled one, in practice, VM-based implementations tend to lag somewhat behind in performance, which favours implementing the GC natively inside the runtime rather than in the language that runs on top of it.