c#, concurrency, producer-consumer

What's the most efficient way to handle infinite tasks in Producer/Consumer?


I have gigabytes of data (stored in messages of about 500 KB each) in a cloud queue (Azure), and more data keeps arriving.

I need to do some processing on each message. I've decided to create two background workers, one to pull the data into memory and the other to process it:

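// Producer: moves one message from the cloud queue into the local queue.
void GetMessage(CloudQueue cloudQueue, LocalQueue localQueue)
{
    lock (localQueue)
    {
        localQueue.Enqueue(cloudQueue.Dequeue());
    }
}

// Consumer: takes one message from the local queue and processes it.
void ProcessMessage(LocalQueue localQueue)
{
    lock (localQueue)
    {
        Process(localQueue.Dequeue());
    }
}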

The issue is that the data never stops coming, so I'll be spending a lot of time synchronizing access to the local queue. Is there a known pattern for this type of problem?


Solution

  • You don't need to hold the lock while you process; hold it only long enough to dequeue the item:

    Item i;
    lock (localQueue)
    {
        i = localQueue.Dequeue();
    }
    Process(i);
    

    Hence there should be little contention. If necessary, reduce how often the producer takes the lock by batching the insertions: rather than having the queue hold individual items, have it hold batches. This reduces the number of lock acquisitions by a factor equal to the average batch size. A simple batching policy is enough: flush every 10 items, say, or after a time interval, or on whichever of a size threshold and a timeout comes first.
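
The batching idea above could be sketched roughly as follows. This is only an illustration under some assumptions: `BatchedQueue<T>`, `Add`, `Flush`, and `TryTakeBatch` are hypothetical names that do not appear in the question, there is a single producer thread, and the batch size is fixed. The producer buffers items in a private list and takes the lock once per batch; the consumer takes the lock once to remove a whole batch and then processes it outside the lock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original post): a lock-protected queue
// whose elements are batches rather than individual items.
public sealed class BatchedQueue<T>
{
    private readonly Queue<List<T>> _batches = new Queue<List<T>>();
    private readonly object _gate = new object();
    private readonly int _batchSize;

    // Assumption: touched only by the single producer thread, so it needs no lock.
    private List<T> _pending;

    public BatchedQueue(int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        _batchSize = batchSize;
        _pending = new List<T>(batchSize);
    }

    // Producer side: buffer locally and publish once the batch is full.
    public void Add(T item)
    {
        _pending.Add(item);
        if (_pending.Count >= _batchSize) Flush();
    }

    // Producer side: publish whatever is buffered, taking the lock once.
    // Calling this when a time limit expires gives the time-based variant.
    public void Flush()
    {
        if (_pending.Count == 0) return;
        List<T> full = _pending;
        _pending = new List<T>(_batchSize);
        lock (_gate)
        {
            _batches.Enqueue(full);
        }
    }

    // Consumer side: remove one whole batch under the lock.
    // The caller processes the returned items after the lock is released.
    public bool TryTakeBatch(out List<T> batch)
    {
        lock (_gate)
        {
            if (_batches.Count > 0)
            {
                batch = _batches.Dequeue();
                return true;
            }
        }
        batch = null;
        return false;
    }
}
```

A consumer would call `TryTakeBatch` in its loop and run `Process` on each item of the returned list. Because `_pending` is not synchronized in this sketch, `Flush` has to be called from the producer thread itself (for example, after checking elapsed time between `Add` calls) rather than from a separate timer thread. With messages of about 500 KB, the batch size also bounds how much data sits unpublished in the producer's buffer, so a small batch size such as the 10 mentioned above may be a reasonable starting point.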