c linux gcc x86 setjmp

Performance overhead of using volatile for setjmp/longjmp


For setjmp/longjmp to work reliably, local variables that are modified between the setjmp call and the longjmp need to be declared volatile. If someone compiles their code with -O3, how much would volatile variables affect performance? Would the impact be huge, or only a tiny bit, on an x86 multicore platform?

In my opinion, it would only add a tiny bit of overhead, because a volatile variable can still be cached, and reading from or writing to cache is quite fast anyway. Opinions?


Solution

  • As a quick aside, the semantics of volatile depend on the platform and compiler. On some compilers, such as MSVC targeting IA64, the volatile keyword not only prevents the compiler from reordering operations, it also gives each read and write acquire/release semantics, meaning a memory barrier is in effect. GCC, on the other hand, only constrains the compiler itself: volatile accesses are not eliminated or reordered relative to one another, but no hardware barrier is emitted. On platforms with weak memory models, the acquire/release semantics are therefore not maintained as they are with MSVC.

    Now on x86, because of its strongly ordered memory model, the memory barriers implied by volatile on such compilers aren't an issue, so the main penalty is simply the loss of reordering and other optimizations that the compiler could otherwise perform. How much that costs depends on what your code looks like. For instance, if you have a tight loop and certain volatile-qualified variables are actually loop invariants, the compiler must still reload them from memory on every iteration, so you lose optimizations (such as keeping the value in a register) that it could apply if those memory locations were not volatile.