c# multithreading producer-consumer missing-data

C# Multi-Producer/Multi-Tiered Multi-Consumer Losing Data


I have built a complex application using a multi-tiered producer-consumer pattern, with multiple consumers performing specialized tasks before enqueuing data to the next group of consumers. The ultimate job of the application is to break down a raw data file into test records for individual units that will have been normalized.

The base of the P-C pattern is Dustin Hyun's pattern from http://dustin-hyun.blogspot.com/2013_07_01_archive.html. I have made numerous modifications because of the multi-tiered approach, among other reasons. The code is too complex to post here, but I could post snippets upon request to help clarify and answer questions.

I have employed two tools to speed up how a file gets processed. The first is running multiple instances of any consumer tier: for example, there could be eight "index" consumers running, whose job is to convert the test data from unit IDs and test names to unit indices and test name indices, normalizing the results for loading into the DB. The second is bundling units into merged DataTables at two points in the operation.

I have identified that data is lost intermittently, but in a fairly predictable pattern: the missing data appears to belong to the last, incomplete bundle. After the standard loop pattern, I check a boolean that I use to flag whether there is an incomplete bundle, and it works:

if (dataToSend)  // Check for an incomplete bundle to process and send before ending the consumer operation.
{
    UpdateLimitsIndices(bundleNlu);
    Enqueue(StdfQType.Func, new BundledNamedTables((N_ParamRes)bundlePR.Copy(), (N_FuncRes)bundleFR.Copy(), numUnitsInCurrBundle));
}

I have also put locks everywhere I can see that any of the P-C entities read or write any of the shared queue members. The locks alone appeared to have no real impact. On a whim, I started to play with the sleep time before the loop re-spins. So far, in limited testing, test conditions that caused data loss with a 1 ms sleep did not cause data loss with a 100 ms sleep or even a 10 ms sleep. Could it be that the longer sleep is allowing the last piece/bundle of data to be properly processed?

I recognize that this question is vague and has few specifics because the application is too complex to post. I do hope I gave enough information for a dialog to start, however. I look forward to hearing your thoughts.

Jeff


Solution

  • I would suggest that, because you are not using thread-safe collections (and neither does the author whose code you are basing yours on), this may be the cause of the lost data: a concurrent write operation that fails silently.

    Luckily, along with the Task Parallel Library (TPL) .NET 4.0 gives us a whole bunch of concurrent collections which ARE thread-safe for multi-threaded environments.

    Have a look at the collections in System.Collections.Concurrent: they are all thread-safe, and they generally perform better under contention than an ordinary collection guarded by a single lock.
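As a rough sketch of what this could look like (not your actual code), here is a two-tier pipeline built on BlockingCollection<T> from System.Collections.Concurrent. Everything named here is hypothetical: PipelineSketch, Run, rawQueue and bundleQueue are made up for illustration, and plain ints stand in for your test records and DataTable bundles. The point is that each tier ends only when the upstream queue has been marked complete and fully drained, so the last, incomplete bundle is flushed without relying on a sleep interval.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Runs a two-tier pipeline and returns how many items reached the final tier.
    public static int Run(int producerCount, int itemsPerProducer, int bundlerCount, int bundleSize)
    {
        using (var rawQueue = new BlockingCollection<int>(boundedCapacity: 100))
        using (var bundleQueue = new BlockingCollection<List<int>>(boundedCapacity: 100))
        {
            // Producers: several threads add items concurrently; Add is thread-safe.
            Task[] producers = Enumerable.Range(0, producerCount)
                .Select(p => Task.Factory.StartNew(() =>
                {
                    for (int i = 0; i < itemsPerProducer; i++)
                        rawQueue.Add(p * itemsPerProducer + i);
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            // Tier 1: each bundler groups items into bundles of bundleSize.
            Task[] bundlers = Enumerable.Range(0, bundlerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    var bundle = new List<int>(bundleSize);
                    // Blocks while the queue is empty; the loop ends only after
                    // CompleteAdding() has been called AND the queue is drained.
                    foreach (int item in rawQueue.GetConsumingEnumerable())
                    {
                        bundle.Add(item);
                        if (bundle.Count == bundleSize)
                        {
                            bundleQueue.Add(bundle);
                            bundle = new List<int>(bundleSize);
                        }
                    }
                    // Flush the last, incomplete bundle before this consumer exits.
                    if (bundle.Count > 0)
                        bundleQueue.Add(bundle);
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            // Tier 2: a single final consumer counts what arrives.
            int received = 0;
            Task finalConsumer = Task.Factory.StartNew(() =>
            {
                foreach (List<int> bundle in bundleQueue.GetConsumingEnumerable())
                    received += bundle.Count;
            }, TaskCreationOptions.LongRunning);

            // Shut down tier by tier: signal completion only after every
            // writer to that queue has finished.
            Task.WaitAll(producers);
            rawQueue.CompleteAdding();
            Task.WaitAll(bundlers);
            bundleQueue.CompleteAdding();
            finalConsumer.Wait();

            return received;
        }
    }
}
```

Calling PipelineSketch.Run(3, 7, 2, 5) pushes 21 items through two bundlers with a bundle size of 5, so at least one bundle is necessarily incomplete, and all 21 items still reach the final tier. The design choice that matters is the ordering at the end: a queue is completed only once all of its producers are done, and a consumer never decides on its own that an empty queue means the work is finished.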