Why does pthread_cond_wait have spurious wakeups?

To quote the man page:

When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.

So, pthread_cond_wait can return even if you haven't signaled it. At first glance at least, that seems pretty atrocious. It would be like a function which randomly returned the wrong value or randomly returned before it actually reached a proper return statement. It seems like a major bug. But the fact that they chose to document this in the man page rather than fix it would seem to indicate that there is a legitimate reason why pthread_cond_wait ends up waking up spuriously. Presumably, there's something intrinsic about how it works that makes it so that that can't be helped. The question is what.

Why does pthread_cond_wait return spuriously? Why can't it guarantee that it's only going to wake up when it's been properly signaled? Can anyone explain the reason for its spurious behavior?

Solution

The following explanation is given by David R. Butenhof in "Programming with POSIX Threads" (p. 80):

Spurious wakeups may sound strange, but on some multiprocessor systems, making condition wakeup completely predictable might substantially slow all condition variable operations.

In the following comp.programming.threads discussion, he expands on the thinking behind the design:

Patrick Doyle wrote: 
> In article , Tom Payne   wrote: 
> >Kaz Kylheku  wrote: 
> >: It is so because implementations can sometimes not avoid inserting 
> >: these spurious wakeups; it might be costly to prevent them. 

> >But why?  Why is this so difficult?  For example, are we talking about 
> >situations where a wait times out just as a signal arrives? 

> You know, I wonder if the designers of pthreads used logic like this: 
> users of condition variables have to check the condition on exit anyway, 
> so we will not be placing any additional burden on them if we allow 
> spurious wakeups; and since it is conceivable that allowing spurious 
> wakeups could make an implementation faster, it can only help if we 
> allow them. 

> They may not have had any particular implementation in mind. 

You're actually not far off at all, except you didn't push it far enough. 

The intent was to force correct/robust code by requiring predicate loops. This was 
driven by the provably correct academic contingent among the "core threadies" in 
the working group, though I don't think anyone really disagreed with the intent 
once they understood what it meant. 

We followed that intent with several levels of justification. The first was that 
"religiously" using a loop protects the application against its own imperfect 
coding practices. The second was that it wasn't difficult to abstractly imagine 
machines and implementation code that could exploit this requirement to improve 
the performance of average condition wait operations through optimizing the 
synchronization mechanisms. 
/------------------[ [email protected] ]------------------\ 
| Compaq Computer Corporation              POSIX Thread Architect | 
|     My book: http://www.awl.com/cseng/titles/0-201-63392-2/     | 
\-----[ http://home.earthlink.net/~anneart/family/dave.html ]-----/