Search code examples
cmultithreadingpthreadsproducer-consumer

Consumer won't start until array is full


Like the title says, I've got a problem with the consumer thread waiting until the entire array is filled until it starts consuming, then the producer wait until it's empty again, and round they go in a circle until they're finished with their loops. I have no idea why they're doing that. Be gentle, as this is a new topic for me and I'm trying to understand mutexes and conditionals.

#include <stdio.h>
#include <pthread.h>

#define BUFFER_SIZE 6
#define LOOPS 40

char buff[BUFFER_SIZE];
pthread_mutex_t buffLock=PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t emptyCond=PTHREAD_COND_INITIALIZER, fullCond=PTHREAD_COND_INITIALIZER;
int buffIndex=0;

void* Producer(){
 int i=0;
 for(i=0;i<LOOPS;i++){
    pthread_mutex_lock(&buffLock);
    while(buffIndex==BUFFER_SIZE)
        pthread_cond_wait(&fullCond, &buffLock);

    buff[buffIndex++]=i;
    printf("Producer made: %d\n", i);
    pthread_mutex_unlock(&buffLock);
    pthread_cond_signal(&emptyCond);
 }

pthread_exit(0);
}

void* Consumer(){
 int j=0, value=0;
 for(j=0;j<LOOPS;j++){
    pthread_mutex_lock(&buffLock);
    while(buffIndex==0)
        pthread_cond_wait(&emptyCond, &buffLock);

 value=buff[--buffIndex];
 printf("Consumer used: %d\n", value);
 pthread_mutex_unlock(&buffLock);
 pthread_cond_signal(&fullCond);
 }

pthread_exit(0);
}

int main(){
 pthread_t prodThread, consThread;

 pthread_create(&prodThread, NULL, Producer, NULL);
 pthread_create(&consThread, NULL, Consumer, NULL);

 pthread_join(prodThread, NULL);
 printf("Producer finished.\n");
 pthread_join(consThread, NULL);
 printf("Consumer finished.\n");

 return 0;

}

Solution

  • The thesis that the producer and consumer threads alternate running BUFFER_SIZE times is incorrect. The program here exhibits nondeterminism, so one of many orderings are possible between producer and consumer. The way the program is written only guarantees two things:

    1. If the consumer gets the lock when the buffer is empty, it will relinquish the lock and wait to be signaled by the producer.
    2. If the producer gets the lock when the buffer is full, it will relinquish the lock and wait to be signaled by the consumer.

    As a consequence of these two properties, it's guaranteed that the producer will always emit first and the consumer will always be the last of the two threads to print. It also guarantees that neither thread can successfully claim the lock more than BUFFER_SIZE times in a row.

    Beyond the above two guarantees, an actual run will yield completely nodeterministic results. It just so happens that your operating system has arbitrarily decided to re-schedule the latest thread repeatedly in your observed runs. This is reasonable because your program has said to the scheduler "do whatever you want within the ordering constraints of above two rules". The OS is free to bias towards scheduling the same thread to run on the CPU again if it wants; in fact, this probably makes the most sense as the thread that just ran on the CPU has its resources already loaded (locality of reference), so overhead from context switches is reduced.

    While the OS may schedule the same thread repeatedly, it's also possible that, for example, the entire process is descheduled and a higher-priority process is run, potentially evicting the working set of the first process. In this scenario, when the first process is re-scheduled, any thread may have just as good of a shot of being scheduled.

    Whatever the case, whenever nondeterminism is permitted by the programmer, ordering is out of their hands and may not appear to be random for a variety of complex reasons.

    As I mentioned in the comments, it's important to be able to convince yourself of the program's ordering properties without running the program. Running the program can prove that nondeterminism exists, but it cannot prove the program to be deterministic. Some multithreaded programs contain subtle scheduling bugs or race conditions that may only arise once in a trillion (or more!) runs, so there's no way to hand-check such nondeterministic programs. Luckily, this one is trivial, so it's easy to simply run it until the nondeterminism appears.

    A useful tool for debugging multithreaded programs is sleep(1) in unistd.h. This function causes the calling thread to be descheduled, perturbing the natural ordering of the program and forcing a certain ordering. This can help you prove ordering properties. For example, adding a sleep(1) after pthread_cond_signal(&emptyCond); shows that given the opportunity, the consumer will grab the lock before BUFFER_SIZE productions occurred in your program.

    For complex programs, tools like Cuzz exist to programmatically insert sleep calls to uncover ordering bugs. See testing approach for multi-threaded software for a variety of resources and strategies.