volatile just doesn't fit any paradigm

Am I the only one who thinks so? [Real question in a second.]

Beyond the widespread confusion about it, and the way it gets mixed up with mutexes and other locking mechanisms, whenever I deal with a thread-safety problem I end up discarding volatile, because it simply does not do anything useful.

volatile "forbids any reordering or caching of reads/writes", but this is not of much use as long as volatile only tags a single object (and taints it, since it is no longer a "normal" object).

Consider this:

Thread A            Thread B
reads vars
                    locks mutex (gets access)
locks mutex (waits)
                    writes some vars
                    releases mutex
reads vars again
releases mutex

Now, the compiler might want to optimize thread A's two reads by keeping some of the results in registers. You say I should declare those vars as volatile. I say that I don't want to flag everything as volatile, because volatile is transparent as a stone and I have to duplicate 50% of my code to support volatile types.

You (he) say that locking a mutex (at least a POSIX mutex) is either something the compiler recognizes and handles correctly, or a call into a library (inaccessible to the compiler) that can change the world, so the compiler will not assume anything after such a call. I say that this is too implementation dependent, very low-level stuff (and I don't want to browse dev docs for day-to-day programming). Worse, it can suddenly change if the "external library", for any legitimate reason, becomes accessible to the compiler (maybe the author turns the function into a template that has to be included in a header... whatever).
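To make the scenario concrete, here is a minimal sketch in C with POSIX threads. The names `shared_value`, `vars_mutex`, `writer_b` and `run_mutex_demo` are hypothetical, and thread A is simply the calling thread. The variable is deliberately not volatile; the sketch relies on exactly the behaviour argued about above, namely that the compiler treats the lock call as a point after which it must re-read memory.

```c
#include <pthread.h>

static pthread_mutex_t vars_mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_value;              /* deliberately NOT volatile */

/* Thread B: takes the mutex, writes the variable, releases the mutex. */
static void *writer_b(void *arg) {
    (void)arg;
    pthread_mutex_lock(&vars_mutex);
    shared_value = 1;
    pthread_mutex_unlock(&vars_mutex);
    return NULL;
}

/* Thread A (the caller): reads once, then re-reads under the mutex
 * until it observes B's write. Returns the value it finally saw,
 * or -1 if thread B could not be created. */
int run_mutex_demo(void) {
    pthread_t b;
    int first, second;

    shared_value = 0;
    first = shared_value;             /* first read, before B exists */

    if (pthread_create(&b, NULL, writer_b, NULL) != 0)
        return -1;

    do {
        pthread_mutex_lock(&vars_mutex);
        second = shared_value;        /* must be a fresh read each time */
        pthread_mutex_unlock(&vars_mutex);
    } while (second == first);

    pthread_join(b, NULL);
    return second;
}
```

If the compiler were allowed to keep `shared_value` in a register across the lock call, the loop could spin forever; the whole question is what guarantees that it is not.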

So, in my opinion, volatile is totally useless (and misleading), except perhaps for some very low-level work (device reads maybe, but I'm not competent in that field).

Much better would be some other way to explicitly tell the compiler to discard every assumption about every variable, so that every subsequent read is a true read from memory. But I cannot imagine anything better than calling a dummy external function, and that would have the same issues I outlined above. Is there an elegant way to do so?

Solution

The new C standard, C11, has what you are calling for. First of all, it has a thread model and a detailed "happens before" relation that extends across threads. And it has an atomic_thread_fence operation (declared in <stdatomic.h>) that does what you are searching for.
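As a rough illustration, the sketch below passes an ordinary, non-volatile variable between two threads using a pair of fences. It is only one way to use atomic_thread_fence, not the canonical one. The names `payload`, `ready`, `producer`, `consumer` and `run_fence_demo` are made up for the example, and it assumes a toolchain that provides both <stdatomic.h> and POSIX threads (the threads are created with pthreads rather than C11's optional <threads.h>).

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;          /* plain data: not volatile, not atomic */
static atomic_int ready;     /* flag used only to signal the consumer */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;                                  /* ordinary write */
    /* Release fence: the write above may not be moved below it. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *consumer(void *arg) {
    int *out = arg;
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0) {
        /* spin until the producer sets the flag */
    }
    /* Acquire fence: the read below may not be moved above it. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;                                /* ordinary read */
    return NULL;
}

/* Runs both threads and returns the value the consumer observed,
 * or -1 if a thread could not be created. */
int run_fence_demo(void) {
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&p, NULL, producer, NULL) != 0)
        return -1;
    if (pthread_create(&c, NULL, consumer, &seen) != 0) {
        pthread_join(p, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Once the consumer has seen the flag and passed its acquire fence, the producer's write to `payload` happens before the consumer's read, so no variable other than the flag needs any special qualifier. Calling `run_fence_demo()` from `main` should therefore return 42.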