I've only recently started learning about pthread condition variables, which appear to be fundamental to this question.
I'm observing what appears to be a thread "breaking through" and acquiring a mutex owned by another thread!
This is blowing the very fundamentals of my understanding of mutex ownership, and I'm at a loss how to explain this:
In the following code, I have class ScopeLock, a fairly common C++ wrapper over a mutex that acquires the mutex in its ctor and releases it in its dtor.
From main(), I spawn two threads, each of which attempts to acquire a common mutex. Because there is a healthy sleep between the two threads' creation, I expect the first spawned thread to acquire the mutex.
In thread 1, I do a pthread_cond_wait() and never signal the condition variable, the intent being to block forever.
The intent is that, since thread 1 acquires the mutex and blocks forever, thread 2 will also block forever when it attempts to acquire the mutex.
Code:
// main.cpp
#include <iostream>
#include <pthread.h>
#include <unistd.h>
class ScopeLock
{
public:
    ScopeLock( pthread_mutex_t& mutex ) : mutex_( mutex )
    {
        pthread_mutex_lock( &mutex_ );
    }
    ~ScopeLock()
    {
        pthread_mutex_unlock( &mutex_ );
    }
private:
    // Hold a reference: copying a pthread_mutex_t and unlocking the copy is undefined.
    pthread_mutex_t& mutex_;
};
pthread_mutex_t g_mutex;
pthread_cond_t g_cond;
void* func1( void* arg )
{
    std::cout << "locking g_mutex from " << pthread_self() << std::endl;
    ScopeLock lock( g_mutex );
    std::cout << "locked g_mutex from " << pthread_self() << std::endl;
    std::cout << __FUNCTION__ << " before cond_wait()" << std::endl;
    pthread_cond_wait( &g_cond, &g_mutex );
    //sleep( 1000 );
    std::cout << __FUNCTION__ << " after cond_wait()" << std::endl;
    return NULL;
}
void* func2( void* arg )
{
    std::cout << "locking g_mutex from " << pthread_self() << std::endl;
    ScopeLock lock( g_mutex );
    std::cout << "locked g_mutex from " << pthread_self() << std::endl;
    std::cout << __FUNCTION__ << std::endl;
    return NULL;
}
int main( int argc, char* argv[] )
{
    pthread_t t1;
    pthread_t t2;
    pthread_mutex_init( &g_mutex, NULL );
    pthread_cond_init( &g_cond, NULL );
    pthread_create( &t1, NULL, func1, NULL );
    sleep ( 2 );
    pthread_create( &t2, NULL, func2, NULL );
    pthread_join( t2, NULL );
    std::cout << "joined t2" << std::endl;
    pthread_join( t1, NULL );
    std::cout << "joined t1" << std::endl;
    return 0;
}
Compilation/output:
>g++ --version
g++ (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>g++ -g main.cpp -lpthread && ./a.out
locking g_mutex from 139707808458496
locked g_mutex from 139707808458496
func1 before cond_wait()
locking g_mutex from 139707800065792 // <-- Here onward is output 2 sec later
locked g_mutex from 139707800065792
func2
joined t2
But the output of the executable shows thread 2 advancing past the mutex acquisition! Can anyone please explain why this happens?
You can see I attempted to sanity-check the situation with the "sleep( 1000 )": if I comment out the pthread_cond_wait() and uncomment the sleep(), then the executable's behavior aligns with my expectation, which is that thread 2 does not advance beyond the "locking g_mutex..." statement in func2().
So I surmise that the "unexpected" behavior of this application is because of the pthread_cond_wait(), but I guess I fundamentally don't understand why: why can thread 2 advance beyond the mutex acquisition? My expectation was that thread 1, having acquired the mutex and waiting on a condition variable that is never signaled, would have blocked thread 2 from acquiring the mutex. Why is this not so?
Grateful for help and explanation from the community.
Edit:
I'm starting to form the inkling of an idea... I remember something about pthread_cond_wait() unlocking its mutex while it waits, so I wonder if it's "undoing" the ScopeLock's intended mutex-hold? I don't have a proper, fully formed idea, though, so I could still use a comprehensive answer from knowledgeable users.
The intent is that, since thread 1 acquires the mutex and blocks forever, thread 2 will also block forever when it attempts to acquire the mutex.
From the documentation:
These functions atomically release mutex and cause the calling thread to block on the condition variable cond;
Therefore, thread 1 releases the mutex while it waits, and thread 2 is then free to acquire it.
That's okay though, because pthread_cond_wait re-acquires the mutex before returning, which makes your use perfectly fine:
Upon successful return, the mutex shall have been locked and shall be owned by the calling thread.
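Both quoted guarantees can be observed directly. The following is a minimal sketch, not a definitive test: it assumes a platform that supports PTHREAD_MUTEX_ERRORCHECK (on which the owner's second lock attempt fails with EDEADLK), and every name in it (Demo, DemoResult, demo_waiter, run_demo) is hypothetical and not part of your program. The main thread locks the mutex while the waiter is blocked in pthread_cond_wait(), and the waiter then checks that it owns the mutex again once the wait returns.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sched.h>

// All names below are hypothetical and exist only for this illustration.
struct Demo {
    pthread_mutex_t mutex;  // error-checking mutex, so ownership can be observed
    pthread_cond_t  cond;
    bool waiting;           // set by the waiter while it holds the mutex
    bool ready;             // predicate the waiter blocks on
    int  relock_result;     // waiter's attempt to lock again after the wait returns
};

struct DemoResult {
    bool taken_during_wait; // main locked the mutex while the waiter was blocked
    int  relock_result;     // expected to be EDEADLK on an error-checking mutex
};

void* demo_waiter(void* p)
{
    Demo* d = static_cast<Demo*>(p);
    pthread_mutex_lock(&d->mutex);
    d->waiting = true;
    while (!d->ready)                            // loop guards against spurious wakeups
        pthread_cond_wait(&d->cond, &d->mutex);  // unlocks while blocked, relocks before returning
    // This thread owns the mutex again, so a second lock attempt is refused.
    d->relock_result = pthread_mutex_lock(&d->mutex);
    pthread_mutex_unlock(&d->mutex);
    return NULL;
}

DemoResult run_demo()
{
    Demo d;
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&d.mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    pthread_cond_init(&d.cond, NULL);
    d.waiting = false;
    d.ready = false;
    d.relock_result = 0;

    DemoResult result;
    result.taken_during_wait = false;

    pthread_t t;
    int rc = pthread_create(&t, NULL, demo_waiter, &d);
    assert(rc == 0);
    (void)rc;

    for (;;) {
        pthread_mutex_lock(&d.mutex);
        if (d.waiting) {
            // The waiter set the flag under the lock and never unlocked explicitly,
            // so only pthread_cond_wait() can have released the mutex to us.
            result.taken_during_wait = true;
            d.ready = true;
            pthread_cond_signal(&d.cond);
            pthread_mutex_unlock(&d.mutex);
            break;
        }
        pthread_mutex_unlock(&d.mutex);
        sched_yield();                           // the waiter has not locked yet; try again
    }

    pthread_join(t, NULL);
    result.relock_result = d.relock_result;
    pthread_cond_destroy(&d.cond);
    pthread_mutex_destroy(&d.mutex);
    return result;
}
```

Calling run_demo() from a main() and building with g++ -pthread should be enough to try it. The main thread plays the role of your thread 2: it gets the mutex precisely because the waiter is parked inside pthread_cond_wait().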
This question may help you understand why it works that way: Why do pthreads’ condition variable functions require a mutex?