I want to improve the performance of a program by replacing some of the mutexes with spinlocks. I have found a spinlock implementation in
which I intend to reuse. I believe this implementation is safer than simpler implementations in which threads keep trying forever like the one found here
But I need to clarify some things about the yield function found here
First of all, I assume the numbers 4, 16 and 32 are arbitrary. I tested some other values and got the best performance in my case with different ones.
But can someone explain the reasoning behind the yield code? Specifically, why do we need all three:
BOOST_SMT_PAUSE
sched_yield
and nanosleep
Yes, this concept is known as "adaptive spinlock" - see e.g. https://lwn.net/Articles/271817/.
Usually the numbers are chosen for exponential back-off: https://geidav.wordpress.com/tag/exponential-back-off/
So, the numbers aren't arbitrary. However, which numbers work for your case depends on your application patterns, requirements and system resources.
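To make the back-off idea concrete, here is a minimal sketch of a spinlock whose busy-wait window doubles after every failed attempt. The names `backoff_iterations` and `BackoffSpinlock` and the cap of 1024 are my own assumptions for illustration, not taken from Boost or from the linked articles.

```cpp
#include <algorithm>
#include <atomic>

// Hypothetical helper: how many busy-wait iterations to spend after the
// given number of failed attempts. Doubles each time, bounded by `cap`.
inline unsigned backoff_iterations(unsigned attempt, unsigned cap = 1024) {
    if (attempt >= 31) return cap;          // keep the shift well-defined
    return std::min(1u << attempt, cap);
}

// Illustrative spinlock with exponential back-off (a sketch, not tuned).
class BackoffSpinlock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        unsigned attempt = 0;
        while (locked_.exchange(true, std::memory_order_acquire)) {
            const unsigned n = backoff_iterations(attempt++);
            for (unsigned i = 0; i < n; ++i) {
                // Wait on a plain load; leave early if the lock looks free.
                if (!locked_.load(std::memory_order_relaxed)) break;
            }
        }
    }
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};
```

The cap matters as much as the growth rate: without it a thread that lost many rounds could wait far longer than the lock is ever held. Both values are something you would have to measure for your workload.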
The three methods to introduce "micro-delays" are designed explicitly to balance the cost and the potential gain:
yield might allow the OS to avoid a context switch, depending on other system load (e.g. if the number of threads < the number of logical cores).
The trade-offs with these are important for low-latency applications, where the effect of a context switch or cache misses is significant.
All trade-offs try to find a balance between wasting CPU cycles and losing cache/thread efficiency.
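As a rough illustration of how the three delays can be layered, here is a sketch of a tiered wait function and a lock built on it. The thresholds 4, 16 and 32 are the ones from the question; the names `cpu_relax`, `classify`, `backoff` and `AdaptiveSpinlock`, the 1 microsecond sleep, and the use of `std::this_thread` in place of `sched_yield`/`nanosleep` are my assumptions. This is not the Boost code itself.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a CPU pause hint such as BOOST_SMT_PAUSE.
// Compiles to nothing on targets where the builtin is unavailable.
inline void cpu_relax() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_ia32_pause();
#endif
}

enum class BackoffStep { Spin, Pause, Yield, Sleep };

// Map the number of failed attempts `k` to a wait strategy.
inline BackoffStep classify(unsigned k) {
    if (k < 4)  return BackoffStep::Spin;   // retry immediately
    if (k < 16) return BackoffStep::Pause;  // cheap CPU-level hint
    if (k < 32) return BackoffStep::Yield;  // offer the core to the scheduler
    return BackoffStep::Sleep;              // stop burning cycles
}

inline void backoff(unsigned k) {
    switch (classify(k)) {
    case BackoffStep::Spin:  break;
    case BackoffStep::Pause: cpu_relax(); break;
    case BackoffStep::Yield: std::this_thread::yield(); break;
    case BackoffStep::Sleep:
        // Assumed duration; tune for your workload.
        std::this_thread::sleep_for(std::chrono::microseconds(1));
        break;
    }
}

class AdaptiveSpinlock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        for (unsigned k = 0; flag_.test_and_set(std::memory_order_acquire); ++k)
            backoff(k);
    }
    void unlock() { flag_.clear(std::memory_order_release); }
};
```

Each tier costs more than the previous one but wastes fewer cycles while waiting, so short waits stay cheap and long waits do not monopolise a core.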