c++garbage-collection abstract-syntax-tree interpreter mark-and-sweep

Stop-the-world garbage collection for AST-interpreted language: prevent application threads waiting for one

I’m developing an AST-interpreted scripting language in C++. The interpreter has a simple stop-the-world mark-and-sweep garbage collector which, whenever collection is triggered, sends a stop request to all the application threads and then waits for all of them to be paused. Each thread has only one safe-point where it can honor the gc’s request, placed in a method exec() that is called every time a line of interpreted code is being executed, like the following:

void Thread::exec(const Statement *stmt){
    if(runtime->gcPauseRequested){
        this->paused = true;
        gcCallback.notify_one(); //notify GC that this thread is now waiting
        gcConditionVariable.wait(gcLock); //wait for GC to be finished
        this->paused = false;
    }
    // execute statement...
}

and the garbage collector:

void MemoryManager::gc(){
    runtime->gcPauseRequested = true;
    while(!allThreadsArePaused()){
        gcCallback.wait(gcCallbackLock);
    }
    runtime->gcPauseRequested = false;
    //garbage collect and resume threads...

}

Here is the problem: the language supports native function calls, but with the current system if a thread is performing a native call that is taking a long time (e.g. a native sleep function), all the other application threads and the garbage collector thread will be waiting for that thread to get to the safe point so that garbage collection can be performed. Is there a way to avoid this?

Solution

Is there a way to avoid this?

Not with your current design, and the apparantly opaque properties (can't see/touch inside) of your "native" code.

Your design is simple: each thread must occasionally be in a "safe" place where it isn't allocating objects that can be known by your language, and it isn't holding pointers to such objects in places that can't be seen by the GC. You ensure that by insisting on a thread protocol that forces each thread to periodically check if a GC is desired, at place you have designed to be safe for that thread.

Your native functions being called are simply not following your protocol. They can do two bad things: a) allocate interpreted language objects, and b) hold pointers to such objects in the opaque state (registers, variables in stack frames not seen by the GC, variables in objects allocated outside of what your memory manager allocates, ...) of the native function.

Given that these actions violate the protocol, you presumably can't fix this if you leave the allocator and the native code alone.

So, you either have to change your protocol to something else [and still figure out a solution], or change what the allocator and native code do.

You might solve a) by insisting the GC and the memory allocator share a lock, such that only one can be active at any one time. This will prevent your native code from allocating while the GC is running. This may add additional overhead to your memory allocator; maybe not since it presumably has to have defense against multiple threads running interpreted code and all attempting to allocate objects at the same time. Even if you have a thread local allocator, at some point that local allocator must run out of space and try to get more from a pool shared by all the threads, e.g., the one the OS provides.

You can solve b) by insisting the native code occasionally stores all the pointers it is holding in its opaque state back into a public place where they can be seen by the GC, and pausing just like the interpreter threads do.

And more complex way to insist on pointer safety in the native threads is to build a memory map (best done offline) of their content, that labels each machine instruction (or cache line containing code) with a boolean: "safe to GC here" or "not safe to GC here". Then the GC stops each thread, asks if is running in native code, if so fetches the PC and checks out the corresponding boolean flag. If safe, proceed with GC. If not, single step the thread to the next instruction and check the revised PC. Yes, this is pretty tricky logic. And how you figure which instructions are "safe" vs. "not safe" is an additional (fairly big) problem; if there are parts of the native code for which you don't know the answer, you can always go conservative and mark "not safe to GC here". You are still counting on the native code to not go into some kind of loop that doesn't have any "safe" points, or at least to not do that very often.

If you take this second approach, you can use in your interpreter, too. This would avoid the extra overhead of each interpreter thread polling the GC flag after every statement. As you tune your interpreter for speed (you'll discover you want to do that as soon as you get it running), you'll discover that polling gets to be growing fraction of the runtime overhead.