A big question on multi-threading and its nuances. What is it and how to work with it properly? Mutex, semaphores

I read the documentation. Now, I kind of understand what threading a function does in Godot. I even use it more or less successfully (but after adding some more bits to the code, the project just crashes). I read through the thread-safe API too, and that is a gray zone for me. Mutexes are as well. And Semaphores. I'm not an IT guy, so it's more like an "Explain like I'm 5".

What specifically can or can't I use threading with? Is it ok to call another node's function? Or should I put a new thread directly in that node?
What are Mutexes? What does locking and unlocking a mutex do?
What are Semaphores?
When to use what?

Solution

As if you were 5? OK... In the land of code...

Single-Threaded vs Multi-Threaded

The execution of the code flows from one instruction to the next. Sometimes it jumps, skipping some conditional sections (when there is an if or a match statement), or it goes in a loop (when there is a for or a while statement). But no matter what, it is always executing a single instruction at a time.

This is what we call, "single threaded" execution. But one day, "multi-threaded" came to the land of code...

When there are multiple threads, each has its own execution flow. While one thread is on one instruction, the other is in some other instruction.

This might be good, because it might allow the code to do more work in the same time. But it might also be bad, because what one thread is doing might mess with the others.

Thus care must be put so threads don't tangle...

On Main Thread

Even though there is multi-threading, there is still the main thread, which is the same one you had when execution was single-threaded. And many things are bound to the main thread (notably the UI is bound to the thread that created the window, and the first window is created by the main thread). Trying to access them from another thread will cause problems, from the mundane to the nasty.

Deferred execution

Our main tool for that is deferred execution, because it will always happen on the main thread, regardless of which thread requested it. And since a thread is only doing one thing at a time, there won't be trouble.

Deferred calls are a good way for other threads to deliver results to the main thread. But doing everything with deferred execution squanders the potential of threads.

Since the deferred call will happen sometime later on the main thread, from the point of view of the secondary threads, the deferred call is fire and forget (the secondary thread can't await it, or get a result from it, it continues executing right away).

You might use a deferred call from a secondary thread to hand the main thread a Callable. But when the main thread calls that Callable, the code runs on the main thread. I'm telling you this so you know it is not a workaround: the work still lands on the main thread.

I want you to imagine this scenario: You have some complex computation, or you are doing some large IO operation (e.g. reading a large file), which will take some time...

If you do that in the main thread, it will be busy doing that, and won't update the rest of your game/application. As a result, the user will see that the game/application freezes.

To avoid that, you might use a secondary Thread, where you do the work. But at the end you want to show a result, or you want to show some progress bar while it is running. To do that, the secondary thread can use a deferred call.
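As a minimal sketch of that scenario, assuming Godot 4 and GDScript: the script below starts a secondary Thread, fakes slow work with OS.delay_msec(), and reports progress through deferred calls. The function names (_do_slow_work, _on_progress, _on_finished) and the step count are made up for illustration.

```gdscript
extends Node

# Hypothetical example node; all names below are made up for illustration.
var _worker := Thread.new()

func _ready() -> void:
    # Run the slow work on a secondary thread so the main thread keeps updating.
    _worker.start(_do_slow_work.bind(5))

func _do_slow_work(steps: int) -> void:
    for i in steps:
        OS.delay_msec(200)  # Stand-in for a heavy computation or a large file read.
        # Fire and forget: the call is queued and runs later on the main thread.
        _on_progress.call_deferred(i + 1, steps)
    _on_finished.call_deferred()

func _on_progress(done: int, total: int) -> void:
    # Runs on the main thread, so updating a progress bar here would be fine.
    print("progress: %d/%d" % [done, total])

func _on_finished() -> void:
    # Also on the main thread. A started Thread must be joined with wait_to_finish().
    _worker.wait_to_finish()
    print("done")
```

The secondary thread never touches the UI itself. It only queues calls, and the main thread runs them when it gets to them.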

Pre-emptive threading

To be able to have multiple threads work on the same values, we need to understand threads better. Notably, threading is pre-emptive (usually), which means a thread might be suspended and resumed at any moment. Sometimes under your control, sometimes beyond your control. As a result, whatever a thread is doing might be temporarily left in an unfinished state.

Some instructions are atomic, meaning they cannot be subdivided, and thus a thread either does them or not. They cannot be left half done. Sadly it might depend on the amount of data and the platform (sometimes reading or writing is not a single operation, but must be done in chunks)… So, unless explicitly stated that something is guaranteed to be atomic, don't rely on that.

Thus, one of the things that might happen is that a thread might be writing a variable, but only write it halfway, and another thread reads it like that and... oh no, undefined behavior! chaos! crashes!

Something that won't be atomic (usually) is incrementing a variable. It can be broken down into reading the value of the variable, computing the incremented value, and then writing it.

In this case, one thread might read the value, compute the incremented value... And it gets preempted! Another thread comes in, reads the value, computes the incremented value and writes it. Now the first thread resumes and writes the value it computed... But it overwrites the work of the second thread! The first thread didn't know the second changed the value! As a result, the value was incremented only once, not twice.

This is known as a lost update, and it is an example of a race condition. See also the related ABA problem and Time-of-check to time-of-use.
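To make the interleaving concrete, here is a small sketch that replays it step by step on a single thread. No real threads are involved; the variables prefixed with a_ and b_ are hypothetical stand-ins for what each thread would hold privately.

```gdscript
extends Node

var counter := 0

func _ready() -> void:
    # "Thread A" reads the value and computes the increment, then gets preempted.
    var a_read := counter
    var a_result := a_read + 1
    # "Thread B" runs all three steps while A is suspended.
    var b_read := counter
    var b_result := b_read + 1
    counter = b_result
    # "Thread A" resumes and writes its stale result over B's work.
    counter = a_result
    print(counter)  # Prints 1, even though two increments happened.
```

With real threads you cannot predict when, or whether, this interleaving happens, which is what makes this kind of bug so hard to reproduce.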

Locks

If you need a secondary thread to wait for something, please use a lock (mutex or semaphore).

We want other threads to wait until we are done computing and writing values before they read the results. And also to stop them from writing anything that might mess with the work we are doing in a thread.

We call the parts of the code where we want to control the access of threads "Critical Sections".

And we have tools for that!

Mutex

One tool for thread control is the Mutex! Which stands for Mutual Exclusion! When a thread takes the Mutex, it locks it. Only one thread can hold it at a time, and every other thread that tries to lock it will be suspended until the thread that holds the mutex releases it.
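A minimal sketch of a Mutex guarding a critical section, assuming Godot 4: two threads increment the same counter, and the lock()/unlock() pair makes sure only one of them is inside the increment at a time. The names (_counter, _add_many) and the loop counts are made up for illustration.

```gdscript
extends Node

var _counter := 0
var _mutex := Mutex.new()
var _threads: Array[Thread] = []

func _ready() -> void:
    for i in 2:
        var t := Thread.new()
        t.start(_add_many.bind(1000))
        _threads.append(t)
    # Blocking here is only acceptable because this demo finishes quickly.
    for t in _threads:
        t.wait_to_finish()
    print(_counter)  # 2000, because no increment can be lost.

func _add_many(times: int) -> void:
    for i in times:
        _mutex.lock()
        _counter += 1  # Critical section: read, compute, write.
        _mutex.unlock()
```

Keep the locked region as small as you can. While one thread holds the mutex, every other thread that wants it is just waiting.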

But sometimes you don't want your thread to wait until the mutex is released, in that case, you can have them try to lock the mutex, and if they fail, they are free to do something else. Just don't let them run amok in the critical section.
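A sketch of the non-blocking variant, assuming Godot 4.1 or later, where Mutex.try_lock() returns a bool (in Godot 4.0 it returns an Error instead, so the check would differ). The helper try_add and the variable _shared_total are hypothetical.

```gdscript
extends Node

var _mutex := Mutex.new()
var _shared_total := 0

# Hypothetical helper: returns true if the update was applied,
# false if another thread currently holds the mutex.
func try_add(amount: int) -> bool:
    if not _mutex.try_lock():
        return false  # Busy: the caller is free to do something else.
    _shared_total += amount  # Critical section, entered only with the lock held.
    _mutex.unlock()
    return true
```

A thread calling try_add() can check the result and come back later instead of being suspended. What it must not do is touch _shared_total after a failed attempt.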

Sometimes you have one thread generating values for multiple other threads to use. In this case the critical section when threads read the value is not mutually exclusive.

Semaphore

Another tool for thread control is the Semaphore. This is useful when you have one thread producing values, and other threads consuming them. And, of course, you don't want threads to go in and consume values until they are ready.

What you do is have the consumer threads wait on the Semaphore, and then the producer thread can signal the semaphore to let a thread through. And the Semaphore will let as many threads through as the producer thread signals.
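A minimal producer and consumer sketch, assuming Godot 4: the consumer thread sleeps in Semaphore.wait() until the producer calls Semaphore.post(), once per value. The queue, the null "stop" marker and the function names are made up for illustration. The Mutex is there because the Array itself is still shared data.

```gdscript
extends Node

var _queue: Array = []
var _queue_mutex := Mutex.new()
var _items_ready := Semaphore.new()
var _consumer := Thread.new()

func _ready() -> void:
    _consumer.start(_consume)
    for i in 3:
        _produce(i)
    _produce(null)  # Hypothetical convention: null tells the consumer to stop.

func _produce(value) -> void:
    _queue_mutex.lock()
    _queue.push_back(value)
    _queue_mutex.unlock()
    _items_ready.post()  # Let one waiting consumer through.

func _consume() -> void:
    while true:
        _items_ready.wait()  # Suspends until the producer has posted.
        _queue_mutex.lock()
        var value = _queue.pop_front()
        _queue_mutex.unlock()
        if value == null:
            break
        print("consumed ", value)

func _exit_tree() -> void:
    _consumer.wait_to_finish()
```

Each post() lets exactly one wait() return, so the consumer never wakes up to an empty queue.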

Sadly we don't have much more in Godot. I wish we had fancy reader-writer locks, or atomic increments, or interlocked operations, or...

Reordering

Why don't we just use a bool to indicate if the results are ready? Another thread could read it, and only access if it is true... Because shenanigans!

First of all, although Godot does not do this to GDScript, the C++ compiler used to compile Godot might reorder instructions as long as it is convinced the result is the same (and this is checked statically without considering other threads).

In general, C++ compilers are free to compile the code however they want, as long as the result is as if the code did what the programmer wrote (this is the freedom that allows them to introduce optimizations, for example removing an instruction to write a variable that apparently nobody reads).

Second, even if the compiler does not reorder instructions, the CPU still might. While this is rare on the CPUs used in desktop computers, it is not so rare on other architectures.

And third, CPUs have cache. In particular modern multi-core CPUs have cache per core. And they will read data from their cache, if it is there, instead of reading it from the RAM.

As a result, a thread might not see changes that another thread does right away, or in the same order. Thus, even if they see the bool set to true, they might still see stale values for other variables.

To prevent reordering, we use something called "memory barriers" (and similar mechanisms, which would be used to implement locks)… However, there is little point in going into that, since we have no means of making them from GDScript. We can only use Mutex and Semaphore and…

Thread Groups

I still need to tell you about the newfangled thread groups. If you set the thread group of a Node to PROCESS_THREAD_GROUP_MAIN_THREAD, it will run in the main thread. If you set it to PROCESS_THREAD_GROUP_SUB_THREAD, it will run in a separate thread. And if you set it to PROCESS_THREAD_GROUP_INHERIT, it will run in the same thread as its parent.

This allows you to define a set of Nodes that will operate on a separate thread, and as long as they only work among themselves, there won't be trouble, since they are all working on the same thread. And you can use a deferred call to give results to the main thread.

We also have a set of functions in Node that are meant to make the communication with sub threads safe (the methods with thread in the name in the Node class). These have memory barriers built-in.
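A rough sketch of how that could look, assuming Godot 4.1 or later and a node whose Thread Group is set to "Sub Thread" in the Inspector. The script name, set_target and _report are hypothetical, and thread groups are still marked experimental, so treat the details as subject to change.

```gdscript
# worker.gd (hypothetical). Assumes this node's Thread Group is set to
# "Sub Thread" in the Inspector (PROCESS_THREAD_GROUP_SUB_THREAD).
extends Node

var _target := 0
var _current := 0

# Meant to be invoked through call_deferred_thread_group(), so it runs with
# this node's own thread group instead of on the caller's thread.
func set_target(value: int) -> void:
    _target = value

func _process(_delta: float) -> void:
    if _current == _target:
        return
    _current = _target  # Stand-in for real work done on the sub-thread.
    # Hand the result to the main thread with an ordinary deferred call.
    _report.call_deferred(_current)

func _report(value: int) -> void:
    print("worker reached ", value)
```

From the main thread, and given a reference named worker to this node, the request would be sent with worker.call_deferred_thread_group("set_target", 10). Node also has call_thread_safe(), which calls directly when that is allowed and defers otherwise.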

Signals and awaiting

When you await a signal (which you can do from secondary threads, but please don't), the call returns. That is, the execution flow will exit the method. And it will give back an object storing the position where the execution flow left off, so it can be resumed later, when the awaited signal is emitted.

Then, when the signal is emitted, the main thread will get that object that represents the position of execution and go on from there! As a result, awaiting on a secondary thread results in switching threads.

Note: this mechanism might change in future, as thread groups are still experimental and are likely to be expanded to cover this case.

On Thread Safety

Oh, thread safety! It means that something is safe to be called directly from any thread. Or put another way, it is written internally under the assumption that it might be called from multiple threads concurrently, and precautions are in place to prevent that from causing any unexpected behavior or crashes... In simple words: the developers took care of the multi-threading, so you don't have to.

And, as the documentation points out, the scene tree is NOT thread safe. Which makes any operation on the scene tree a critical section by default.
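In practice this usually means preparing things off the tree and letting the main thread do the actual tree operation. A minimal sketch, assuming Godot 4: the thread creates a node that is not in the scene tree yet, then defers add_child so the insertion happens on the main thread. The names _builder and _build_node are made up for illustration.

```gdscript
extends Node

var _builder := Thread.new()

func _ready() -> void:
    _builder.start(_build_node)

func _build_node() -> void:
    # This node is not in the scene tree yet, so only this thread can see it.
    var node := Node.new()
    node.name = "BuiltOnThread"
    # Adding it to the tree is a scene tree operation, so defer it to the main thread.
    add_child.call_deferred(node)

func _exit_tree() -> void:
    _builder.wait_to_finish()
```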

A note on physics

Another thing to consider is that Godot might use a separate thread for physics (which you enable from project settings). This will also apply to some calls your Node gets from physics.