Search code examples
cmultithreadingoperating-systemsimulationnios

Why does my producer - consumer pattern have unexpected output?


In this excercise (producer - consumer), 3 producers are one thread each and produce prime numbers. But when I run the program, the first prime number that is consumed is not the first that it produces, so I don't get the expected output. Can you please help me find and correct the bug?

Here's the main file with the pattern:

#include <stdio.h>
#include "oslab_lowlevel_h.h"

int NextPrime( int );

#define FIFO_SIZE 10

/* Declare a structure to hold a producer's starting value,
 * and an integer for the Producer-number (Producer 1, 2 or 3). */
struct Prod {
    int startvalue;
    int id;
};

unsigned int stack1[0x400]; /* Stack for thread 1 */
unsigned int stack2[0x400]; /* Stack for thread 2 */
unsigned int stack3[0x400]; /* Stack for thread 3 */
unsigned int stack4[0x400]; /* Stack for thread 4 */
unsigned int stack5[0x400]; /* Stack for thread 5 */

/* Declare variables for the First-In-First-Out Queue */
int  Fifo[FIFO_SIZE];        /* Array holding FIFO queue data. */
int  rdaddr;                /* Next unread entry when reading from queue. */
int  wraddr;                /* Next free entry when writing into queue. */

/* Declaration of semaphore variables.
 * 
 * Sorry for the lack of comments, but part of the purpose of the lab
 * is that you should find things out by reading the actual code. */
int  rdmutex = 1;
int  wrmutex = 1;
int  nrempty = FIFO_SIZE;
int  nrfull = 0;

/*
 * fatal_error
 * 
 * Print a message, then stop execution.
 * This function never returns; after printing
 * the message, it enters an infinite loop.
 */
void fatal_error( char * msg)
{
  printf( "\nFatal error: %s\n", msg );
  while( 1 );
}

/*
 * Sleep
 * 
 * Delay execution by keeping the CPU busy for a while,
 * counting down to zero.
 */
void Sleep (int n)
{
    while (n--);
}

/*
 * Signal
 * 
 * Semaphore operation: add to semaphore,
 * possibly allowing other threads to continue.
 */
void Signal( int *sem )
{
  /* We must disable interrupts, since the operation
   * *sem = *sem + 1
   * will require several machine instructions on Nios2.
   * If we have a timer-interrupt and a thread-switch
   * somewhere in the middle of those machine instructions,
   * the semaphore will be updated twice, or not at all, or
   * in some other erroneous way.
   */
  oslab_begin_critical_region();
  *sem = *sem + 1;
  oslab_end_critical_region();
}

/*
 * Wait
 * 
 * Sempahore operation: check semaphore, and
 * wait if the semaphore value is zero or less.
 */
void Wait( int *sem )
{
  /* Disable interrupts. */
  oslab_begin_critical_region();
  while ( *sem <= 0 )
    {
      /* If we should wait, enable interrupts again. */
      oslab_end_critical_region();

      //  oslab_yield(); /* Perhaps we should yield here? */

      /* Disable interrupts again before next iteration in loop. */
      oslab_begin_critical_region();
    }
    /* We have waited long enough - the semaphore-value is now
     * greater than zero. Decrease it. */
    *sem = *sem - 1;
    /* Enable interrupts again. */
    oslab_end_critical_region();
}

/*
 * PutFifo
 * 
 * Insert an integer into the FIFO queue.
 */
void PutFifo( int tal )
{
  //  Wait (&nrempty);      /* Wait for nrempty? */
  //  Wait (&wrmutex);      /* Wait for wrmutex? */

  Fifo[wraddr] = tal;       /* Write to FIFO array. */
    //  printf("\nPutFifo:  %d ", tal); /* Optional debug output */
    //  printf("\nwraddr = %d ", wraddr); /* Optional debug output. */
  wraddr = wraddr + 1;      /* Increase index into FIFO array,
                               to point to the next free position. */
  /* Wrap around the index, if it has reached the end of the array. */
  if (wraddr == FIFO_SIZE ) wraddr = 0;

  //  Signal (&wrmutex);    /* Signal wrmutex? */
  //  Signal (&nrfull);     /* Signal nrfull? */
}

/*
 * GetFifo
 * 
 * Extract the next integer from the FIFO queue.
 */
int GetFifo( void )
{
  int retval;               /* Declare temporary for return value. */

  //  Wait (&nrfull);       /* Wait for nrfull? */
  //  Wait (&rdmutex);      /* Wait for rdmutex? */

  retval = Fifo[rdaddr];    /* Get value from FIFO array. */
    //  printf("\nGetFifo:  %d ", retval); /* Optional debug output */
    //  printf("\nrdaddr = %d ", rdaddr); /* Optional debug output */
  rdaddr = rdaddr + 1;      /* Increase index into FIFO array,
                               to point to the next free position. */
  /* Wrap around the index, if it has reached the end of the array. */
  if (rdaddr == FIFO_SIZE ) rdaddr = 0;

  //  Signal (&rdmutex);    /* Signal rdmutex? */
  //  Signal (&nrempty);    /* Signal nrempty? */

  return (retval);          /* Return value fetched from FIFO. */
}

/*
 * NextPrime
 * 
 * Return the first prime number larger than the integer
 * given as a parameter. The integer must be positive.
 * 
 * *** NextPrime is outside the focus of this assignment. ***
 * The definition of NextPrime can be found at the end of this file.
 * The short declaration here is required by the compiler.
 */
int NextPrime( int );

void Producer( struct Prod * prodstruct )
{
  int next;                 /* Will hold the prime we just produced. */
  int prodid;               /* Tells whether we are producer 1, 2 or 3. */
  next = prodstruct -> startvalue; /* Get starting value from parameter. */
  prodid = prodstruct -> id;/* Get producer number from parameter. */
  while( 1 )                /* Loop forever. */
  {
    next = NextPrime (next);/* Produce a new prime. */
    printf("\nNext Prime from producer %d is %d",prodid,next); /* Informational output. */
    PutFifo(next);          /* Write prime into FIFO. */
  //  oslab_yield();        /* Perhaps we should yield here? */ 
  }
}

void Consumer( int * tal )
{
  int next;                 /* Will hold the prime we are to consume. */
  int consid = *tal;        /* Tells whether we are consumer 1 or 2. */
  while( 1 )                /* Loop forever. */
  {
    next = GetFifo();       /* Get a newly produced prime from the FIFO. */
    printf("\nConsumer %d gets Prime %d ",consid, next); /* Informational output. */
    Sleep(2000);            /* Symbolic work. */
  //  oslab_yield();        /* Perhaps we should yield here? */ 
  }
}

int main( void )
{
  int new_thread_id; /* Thread ID variable. */
  struct Prod prod1, prod2, prod3;  /* Producer starting-values. */
  int cons1, cons2;                 /* Consumer starting-values. */

  rdaddr = 0;               /* FIFO initialization. */
  wraddr = 0;               /* FIFO initialization. */
  printf("\nSystem starting...");

  prod1.startvalue = 2000;
  prod1.id = 1;

  prod2.startvalue = 5000;
  prod2.id = 2;

  prod3.startvalue = 8000;
  prod3.id = 3;

  cons1 = 1;
  cons2 = 2;

  new_thread_id = oslab_create_thread((void *)Producer, &prod1, &(stack1[0x3ff]));
  if( new_thread_id < 0 ) fatal_error( "cannot start Producer 1" );
  printf("\nProducer %d is created with thread-ID %d", prod1.id, new_thread_id);

  new_thread_id = oslab_create_thread((void *)Producer, &prod2, &(stack2[0x3ff]));
  if( new_thread_id < 0 ) fatal_error( "cannot start Producer 2" );
  printf("\nProducer %d is created with thread-ID %d", prod2.id, new_thread_id);

  new_thread_id = oslab_create_thread((void *)Producer, &prod3, &(stack3[0x3ff]));
  if( new_thread_id < 0 ) fatal_error( "cannot start Producer 3" );
  printf("\nProducer %d is created with thread-ID %d", prod3.id, new_thread_id);

  new_thread_id = oslab_create_thread((void *)Consumer, &cons1, &(stack4[0x3ff]));
  if( new_thread_id < 0 ) fatal_error( "cannot start Consumer 1" );
  printf("\nConsumer %d is created with thread-ID %d", cons1, new_thread_id);

  new_thread_id = oslab_create_thread((void *)Consumer, &cons2, &(stack5[0x3ff]));
  if( new_thread_id < 0 ) fatal_error( "cannot start Consumer 2" );
  printf("\nConsumer %d is created with thread-ID %d", cons2, new_thread_id);

  oslab_idle(); /* Must be called here! */
}


/*
 * NextPrime
 * 
 * Return the first prime number larger than the integer
 * given as a parameter. The integer must be positive.
 */
#define PRIME_FALSE   0     /* Constant to help readability. */
#define PRIME_TRUE    1     /* Constant to help readability. */
int NextPrime( int inval )
{
   int perhapsprime;        /* Holds a tentative prime while we check it. */
   int testfactor;          /* Holds various factors for which we test perhapsprime. */
   int found;               /* Flag, false until we find a prime. */

   if (inval < 3 )          /* Initial sanity check of parameter. */
   {
     if(inval <= 0) return(1);  /* Return 1 for zero or negative input. */
     if(inval == 1) return(2);  /* Easy special case. */
     if(inval == 2) return(3);  /* Easy special case. */
   }
   else
   {
     /* Testing an even number for primeness is pointless, since
      * all even numbers are divisible by 2. Therefore, we make sure
      * that perhapsprime is larger than the parameter, and odd. */
     perhapsprime = ( inval + 1 ) | 1 ;
   }
   /* While prime not found, loop. */
   for( found = PRIME_FALSE; found != PRIME_TRUE; perhapsprime += 2 )
   {
     /* Check factors from 3 up to perhapsprime/2. */
     for( testfactor = 3; testfactor <= (perhapsprime >> 1) + 1; testfactor += 1 )
     {
       found = PRIME_TRUE;      /* Assume we will find a prime. */
       if( (perhapsprime % testfactor) == 0 ) /* If testfactor divides perhapsprime... */
       {
         found = PRIME_FALSE;   /* ...then, perhapsprime was non-prime. */
         goto check_next_prime; /* Break the inner loop, go test a new perhapsprime. */
       }
     }
     check_next_prime:;         /* This label is used to break the inner loop. */
     if( found == PRIME_TRUE )  /* If the loop ended normally, we found a prime. */
     {
       return( perhapsprime );  /* Return the prime we found. */
     } 
   }
   return( perhapsprime );      /* When the loop ends, perhapsprime is a real prime. */
} 

The rest of the files are available here.

When I run the code I get the expected output from the producers but I don't get the expected output for the consumers:

System starting...
Producer 1 is created with thread-ID 1
Producer 2 is created with thread-ID 2
Producer 3 is created with thread-ID 3
Consumer 1 is created with thread-ID 4
Consumer 2 is created with thread-ID 5
#### Thread yielded after using 1 tick.
Performing thread-switch number 1. The system has been running for 1 ticks.
Switching from thread-ID 0 to thread-ID 1.

Next Prime from producer 1 is 2003
PutFifo:  2003 
wraddr = 0 
Next Prime from producer 1 is 2011
PutFifo:  2011 
wraddr = 1 
Next Prime from producer 1 is 2017
PutFifo:  2017 
wraddr = 2 
Next Prime from producer 1 is 2027
PutFifo:  2027 
wraddr = 3 
Next Prime from producer 1 is 2029
PutFifo:  2029 
wraddr = 4 
Next Prime from producer 1 is 2039
PutFifo:  2039 
wraddr = 5 
Next Prime from producer 1 is 2053
PutFifo:  2053 
wraddr = 6 
Next Prime from producer 1 is 2063
PutFifo:  2063 
wraddr = 7 
Next Prime from producer 1 is 2069
PutFifo:  2069 
wraddr = 8 
Next Prime from producer 1 is 2081
PutFifo:  2081 
wraddr = 9 
Next Prime from producer 1 is 2083
PutFifo:  2083 
wraddr = 0 
Next Prime from producer 1 is 2087
PutFifo:  2087 
wraddr = 1 
Next Prime from producer 1 is 2089
PutFifo:  2089 
wraddr = 2 
Next Prime from producer 1 is 2099
PutFifo:  2099 
wraddr = 3 
Next Prime from producer 1 is 2111
PutFifo:  2111 
wraddr = 4 
Next Prime from producer 1 is 2113
PutFifo:  2113 
wraddr = 5 
Next Prime from producer 1 is 2129
PutFifo:  2129 
wraddr = 6 
Next Prime from producer 1 is 2131
PutFifo:  2131 
wraddr = 7 
Next Prime from producer 1 is 2137
PutFifo:  2137 
wraddr = 8 
Next Prime from producer 1 is 2141
PutFifo:  2141 
wraddr = 9 
Next Prime from producer 1 is 2143
PutFifo:  2143 
wraddr = 0 
Next Prime from producer 1 is 2153
PutFifo:  2153 
wraddr = 1 
Performing thread-switch number 2. The system has been running for 101 ticks.
Switching from thread-ID 1 to thread-ID 2.

Next Prime from producer 2 is 5003
PutFifo:  5003 
wraddr = 2 
Next Prime from producer 2 is 5009
PutFifo:  5009 
wraddr = 3 
Next Prime from producer 2 is 5011
PutFifo:  5011 
wraddr = 4 
Next Prime from producer 2 is 5021
PutFifo:  5021 
wraddr = 5 
Next Prime from producer 2 is 5023
PutFifo:  5023 
wraddr = 6 
Next Prime from producer 2 is 5039
PutFifo:  5039 
wraddr = 7 
Next Prime from producer 2 is 5051
PutFifo:  5051 
wraddr = 8 
Next Prime from producer 2 is 5059
PutFifo:  5059 
wraddr = 9 
Next Prime from producer 2 is 5077
PutFifo:  5077 
wraddr = 0 
Next Prime from producer 2 is 5081
PutFifo:  5081 
wraddr = 1 
Performing thread-switch number 3. The system has been running for 201 ticks.
Switching from thread-ID 2 to thread-ID 3.

Next Prime from producer 3 is 8009
PutFifo:  8009 
wraddr = 2 
Next Prime from producer 3 is 8011
PutFifo:  8011 
wraddr = 3 
Next Prime from producer 3 is 8017
PutFifo:  8017 
wraddr = 4 
Next Prime from producer 3 is 8039
PutFifo:  8039 
wraddr = 5 
Next Prime from producer 3 is 8053
PutFifo:  8053 
wraddr = 6 
Next Prime from producer 3 is 8059
PutFifo:  8059 
wraddr = 7 
Performing thread-switch number 4. The system has been running for 301 ticks.
Switching from thread-ID 3 to thread-ID 4.

GetFifo:  5077 
rdaddr = 0 
Consumer 1 gets Prime 5077 
GetFifo:  5081 
rdaddr = 1 
Consumer 1 gets Prime 5081 
GetFifo:  8009 
rdaddr = 2 
Consumer 1 gets Prime 8009 
GetFifo:  8011 
rdaddr = 3 
Consumer 1 gets Prime 8011 
GetFifo:  8017 
rdaddr = 4 
Consumer 1 gets Prime 8017 
GetFifo:  8039 
rdaddr = 5 
Consumer 1 gets Prime 8039 
GetFifo:  8053 
rdaddr = 6 
Consumer 1 gets Prime 8053 
GetFifo:  8059 
rdaddr = 7 
Consumer 1 gets Prime 8059 
GetFifo:  5051 
rdaddr = 8 
Consumer 1 gets Prime 5051 
GetFifo:  5059 
rdaddr = 9 
Consumer 1 gets Prime 5059 
GetFifo:  5077 
rdaddr = 0 
Consumer 1 gets Prime 5077 
GetFifo:  5081 

Could you please tell me why the first 30 primes get overwritten when all I do is just following the specification and just removing the comments from the code to activate what the instructor has prepared for us to study? I've been unable to complete this exercise for months since I'm not getting any good help. The first 30 primes are myseriously overwritten and the program should not be changed (it is homework). I asked the instructor and he didn't, said that I'm using newer versions of the software. I could try with older versions of the softare but that doesn't seem like a likely solution.

Update

The strategy I can think of is start using the debugger and inspect the FIFO ADT during execution. I don't have much experience using gdb so please help me if you can.


Solution

  • This code can't possibly be expected to work correctly if you don't uncomment the Wait and Signal calls inside the PutFifo and GetFifo functions. If it does appear to work under certain circumstances on a certain computer, that is pure luck.

    First off, if one or more of the producer threads fill up the FIFO buffer before switching to one of the consumer threads, then some of that data is obviously going to be lost.

    You can clearly see this in your example output. The producer threads have generated 38 values before the first consumer thread gets a chance to run. And because your buffer size is 10, the first value the consumer is going to read is actually the 31st value produced (i.e. the last value written to wraddr 0).

    ╔═══════╦════════╦═══════╗
    ║ count ║ wraddr ║ value ║
    ╠═══════╬════════╬═══════╣
    ║    .. ║     .. ║    .. ║
    ║    29 ║      8 ║  5051 ║
    ║    30 ║      9 ║  5059 ║
    ║    31 ║      0 ║  5077 ║ <- Consumer starts here
    ║    32 ║      1 ║  5081 ║
    ║    33 ║      2 ║  8009 ║
    ║    34 ║      3 ║  8011 ║
    ║    35 ║      4 ║  8017 ║
    ║    36 ║      5 ║  8039 ║
    ║    37 ║      6 ║  8053 ║
    ║    38 ║      7 ║  8059 ║
    ╚═══════╩════════╩═══════╝
    

    Morever, if the consumer threads read more data than is available in the FIFO buffer before switching back to one of the producer threads, then they are going to end up reading the same values more than once. Again you can see this from your example output. The consumer thread reads items 31 to 38, then wraps around to items 29 and 30 (the last values at wraddr 8 and 9), before repeating item 31 again.

    This isn't the worst thing that could happen though. In a true preemptive multithreading system, a producer thread could be preempted halfway through the PutFifo function. So imagine one of the producer threads is writing to the FIFO buffer when wraddr is 9. Let's say it executes these two lines.

    Fifo[wraddr] = tal;       /* Write to FIFO array. */
    wraddr = wraddr + 1;      /* Increase index into FIFO array,
    

    At this point wraddr is 10, but before the function has a chance to check for the overflow (and wrap the index back to 0), the thread gets preempted by another one of the producer threads. And since wraddr is 10, this new producer is going to be writing past the end of the buffer, potentially causing the app to crash.

    If it survives, wraddr will be incremented again (becoming 11), but it still won't wrap to zero, because the overflow check is expecting an exact match with the FIFO_SIZE. So even if it doesn't crash immediately, it's sure to crash at some point, because wraddr will just continue getting bigger and bigger, overwriting more and more memory.

    The bottom line is if you want this code to work you are going to have to add the synchronisation calls.