I created a Packets class to synchronize data processing between consumers and producers. My design acquires the mutex separately for the condition-variable wait and for popping packets from the queue (two critical sections rather than one).
Here’s the implementation of the Packets class:
class Packets
{
public:
    void finish()
    {
        {
            std::lock_guard lock(mMutex);
            mFinished = true;
        }
        mConditionVariable.notify_all();
    }

    void add(const Packet& packet)
    {
        {
            std::lock_guard lock(mMutex);
            mPackets.push(packet);
        }
        mConditionVariable.notify_one();
    }

    Packet pop()
    {
        std::lock_guard lock(mMutex);
        Packet packet = mPackets.front();
        mPackets.pop();
        return packet;
    }

    bool isFinished() const
    {
        return mFinished;
    }

    void waitForReady()
    {
        std::unique_lock lock(mMutex);
        mConditionVariable.wait(lock, [this] { return !mPackets.empty() || isFinished(); });
    }

private:
    std::queue<Packet> mPackets;
    std::mutex mMutex;
    std::condition_variable mConditionVariable;
    bool mFinished = false;
};
Here are the consumer and producer functions that I am using:
std::mutex pMutex;

void produce(Packets& packets)
{
    while (true) {
        std::this_thread::sleep_for(std::chrono::milliseconds(5000));
        Packet packet = generate_random_packet();
        packets.add(packet);
        std::lock_guard lock(pMutex);
        std::cout << "Packet created " << packet.source << " -> " << packet.destination << ": " << packet.data << std::endl;
    }
}

void consume(Packets& packets)
{
    while (true) {
        packets.waitForReady();
        if (packets.isFinished()) {
            std::cout << "Consume finished" << std::endl;
            return;
        }
        auto packet = packets.pop();
        std::lock_guard lock(pMutex);
        std::cout << "Received packet " << packet.source << " -> " << packet.destination << ": " << packet.data << std::endl;
    }
}
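For reference, Packet and generate_random_packet are not shown above; a minimal sketch consistent with the log output (the field names and the four-character payload are assumptions) could be:

```cpp
#include <cstdlib>
#include <string>

// Assumed shape of Packet, matching the fields the log lines print.
struct Packet {
    int source = 0;
    int destination = 0;
    std::string data;
};

// Assumed generator: random endpoints in [0, 100) and a short
// random alphabetic payload, as in the sample output.
Packet generate_random_packet() {
    static const char alphabet[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    Packet packet;
    packet.source = std::rand() % 100;
    packet.destination = std::rand() % 100;
    for (int i = 0; i < 4; ++i)
        packet.data += alphabet[std::rand() % (sizeof(alphabet) - 1)];
    return packet;
}
```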
When I run this code, I sometimes experience an issue where all threads wake up and try to call pop() on the queue, but there aren't enough elements in the queue. This causes a segmentation fault. Here’s the output when this happens:
Packet created 17 -> 1: eQZm
Packet created 39 -> 36: FGTZ
Received packet 10 -> 54: YplK
Received packet 24 -> 50: aUNi
Received packet 0 -> 0:
Received packet 0 -> 0:
Received packet 0 -> 0:
Packet created 99 -> 2: XCcT
Received packet 17 -> 1: eQZm
Received packet 39 -> 36: FGTZ
Segmentation fault (core dumped)
However, when I modify the code to pop the element within the same lock as the condition variable wait, it works correctly:
Packet get()
{
    std::unique_lock lock(mMutex);
    mConditionVariable.wait(lock, [this] { return !mPackets.empty() || isFinished(); });
    Packet packet = mPackets.front();
    mPackets.pop();
    return packet;
}
Question: Why does using separate locks (one for cv wait and one for popping packets) lead to a segmentation fault, whereas using the same lock for both operations works as expected?
What I tried: I implemented a Packets class to synchronize producer and consumer threads, using separate locks for the condition variable (cv wait) and popping packets from the queue.
What I expected: I expected the consumer to wait for packets when the queue was empty, and the producer to add packets without issues, with separate locks not causing problems.
What actually happened: I encountered a segmentation fault when multiple consumers woke up and tried to pop from an empty queue. The issue only occurred with separate locks for cv wait and popping packets. Combining them into one lock worked correctly.
This is a typical time-of-check-to-time-of-use (TOCTOU) bug.
The lock protects the class state; once you release the lock, that state can change, so anything you concluded while holding it is no longer valid.
// two threads arrive here
packets.waitForReady(); // time of check, both succeed
// both threads arrive here, only 1 item in queue
auto packet = packets.pop(); // time of use, first succeeds, second segfaults
condition_variable::wait() holds the lock while it evaluates the predicate and still holds it when it returns, so the class invariants continue to hold for the rest of the locked scope. That is why the following code is correct:
std::unique_lock lock(mMutex);
mConditionVariable.wait(lock, [this] { return !mPackets.empty() || isFinished(); });
// reaches here when there is an item in the queue, and lock is held
// if (isFinished()) { throw ...; } // or return std::optional
Packet packet = mPackets.front(); // lock still held, invariant holds!
mPackets.pop();
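Putting this together, here is one sketch of the combined approach, assuming a minimal Packet struct. Instead of throwing on shutdown, this variant returns std::optional so a finished, drained queue is reported without an exception:

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>

struct Packet {            // minimal stand-in for the question's Packet
    int source = 0;
    int destination = 0;
    std::string data;
};

class Packets {
public:
    void add(const Packet& packet) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mPackets.push(packet);
        }
        mConditionVariable.notify_one();
    }

    void finish() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mFinished = true;
        }
        mConditionVariable.notify_all();
    }

    // Wait, check, and pop under ONE lock; std::nullopt means
    // "finished and drained", so consumers know when to stop.
    std::optional<Packet> pop() {
        std::unique_lock<std::mutex> lock(mMutex);
        mConditionVariable.wait(lock, [this] { return !mPackets.empty() || mFinished; });
        if (mPackets.empty())
            return std::nullopt;            // finished, nothing left to consume
        Packet packet = mPackets.front();   // lock still held, invariant holds
        mPackets.pop();
        return packet;
    }

private:
    std::queue<Packet> mPackets;
    std::mutex mMutex;
    std::condition_variable mConditionVariable;
    bool mFinished = false;
};
```

The consumer loop then becomes `while (auto packet = packets.pop()) { ... }`, leaving no window between the wakeup and the pop in which another consumer can drain the queue.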
This do-then-check pattern appears in many APIs that have to work in multithreaded code: operator new throws when the allocation fails, std::weak_ptr::lock() returns nullptr on failure, and an istream must be checked for failure after every operation. You cannot reliably predict ahead of time whether an operation will succeed when more than one thread is involved.
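As a concrete illustration, std::weak_ptr::lock() attempts the promotion and reports failure through its return value, rather than having callers query expired() first:

```cpp
#include <cassert>
#include <memory>

// Do-then-check with std::weak_ptr: attempt the promotion ("do"),
// then inspect the result ("check").
void demo_weak_ptr_lock() {
    auto shared = std::make_shared<int>(42);
    std::weak_ptr<int> weak = shared;

    if (auto locked = weak.lock()) {       // promotion succeeded
        assert(*locked == 42);
    }

    shared.reset();                        // last owner gone
    assert(weak.lock() == nullptr);        // failure reported via nullptr
    // A check-then-do version -- if (!weak.expired()) use(*weak.lock()); --
    // would be a TOCTOU race: another thread could release the last
    // shared_ptr between expired() and lock().
}
```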