I know concepts of stop-the-world, incremental, parallel, concurrent, (soft/hard) realtime garbage collectors. But I can't understanding mostly-concurrent GC. Is it different with concurrent GC? What's the difference? Why it is called mostly?
I know concepts of stop-the-world, incremental, parallel, concurrent, (soft/hard) realtime garbage collectors. But I can't understanding mostly-concurrent GC. Is it different with concurrent GC? What's the difference? Why it is called mostly?
Like so many other subjects, garbage collection is shrouded in a fog of terminological ambiguity. Boehm is particularly infamous for using conventional terms in unconventional ways but we should forgive him because he was pioneering the field at a time when the conventional meanings had not yet been ossified! :-)
As I understand it, stop-the-world GC refers to an algorithm that suspends all mutator threads for the entire duration of a GC cycle, e.g. when marking the entire heap. For example, the .NET Server GC does this and incurs huge 300ms pause times as a consequence. Incremental GCs perform a little bit of major GC work at each minor GC cycle, e.g. "major slices" in OCaml's GC. Parallel means the GC uses multiple threads to speedup the process of collecting garbage. Concurrent GC means the GC runs at the same time as the mutators, e.g. .NET workstation GC. Real-time is difficult to define, originally meant bounded maximum pause times but now also means minimum mutator utilization (MMU), to avoid the pathological problem of a GC that never pauses a mutator for long by never allowing it to run! According to Richard Jones' new book, an on-the-fly GC never suspends more than one mutator at a time (i.e. there is no stop-the-world phase) although I suspect he meant that mutators are suspended independently of each other. Finally, a mostly-concurrent GC is one that does suspend all mutator threads simultaneously but only for a short period of time and not for an arbitrarily-long GC cycle. Therefore, the mutators are allowed to run freely most of the time while the GC is running and, hence, it is called "mostly concurrent" GC.
The classification of "mostly concurrent" is important because most (all?) major GCs fall into this category because it provides a good trade-off between pause times and throughput. For example, the .NET workstation GC suspends all mutator threads when taking a snapshot of the global roots but resumes them while it marks and sweeps.