c# task-parallel-library locale cultureinfo

Parallel.For causes different results


I am currently trying to improve a C# project I am working on. Specifically, my goal is to parallelize some operations to reduce processing time. I am starting with small snippets just to get the hang of it. The following (non-parallel) code works as expected:

for (int i = 0; i < M; i++)
{
     double d;
     try
     {
          d = Double.Parse(lData[i]);
     }
     catch (Exception)
     {
         throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
     }
     sg[lCount % N][i] = d;
}

By using the following (parallel) code I would expect to obtain the exact same results, but that is not the case.

Parallel.For(0, M, i =>
{
    double d;
    try
    {
        d = Double.Parse(lData[i]);
    }
    catch (Exception)
    {
        throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
    }
    sg[lCount % N][i] = d;
});

The part of the program these snippets are from reads data from a file, one line at a time. Each line is a sequence of comma-separated double-precision numbers, which I put in the array lData[] using String.Split(). Every N lines, the data sequence starts over with a new data frame (hence the % N in the row index when I assign the values).

It is my understanding (clearly wrong) that by putting the body of the (serial) for-loop in the third parameter of Parallel.For I parallelize its execution, and that this shouldn't change the results. Is the problem that the threads are all accessing lCount and M? Should I make thread-local copies?

Thanks.

(since I'm new I am not allowed to create the Parallel.For tag)

EDIT: I ran some more tests. Basically, I inspected the output at an earlier point in the code than I had before. It appears that the parallel version of my code does not fill the sg[][] array entirely. Rather, some values are left at their defaults (0, in my case).

EDIT 2 (to answer some of the comments): lData[] is a string[] obtained by using string.Split(). The original string I am splitting is read from my data files. I wrote the code that generates them, so they are generally well-formatted (I still used the try-catch construct out of habit). Just before the for-loop (either parallel or serial) I check that lData[] has the correct number of values (M). If it doesn't, I throw an exception that prevents the program from reaching the for-loop in question. sg[][] is an N by M array of type double (there was a typo in the snippets, now corrected; in my original code this error was not present). After I read N lines from the file, the array sg[][] contains a whole data set. After the for-loop (either parallel or serial) there is a portion of code that looks like this:

lCount++; // counting the lines I have already read
if ((lCount % N) == 0)
{
    // do things with sg[][]
    // reset sg[][]
}

So, I am on purpose overwriting all lines of sg[][]. The for-loop's whole purpose is to update the values in sg[][].


Solution

  • After doing some line-by-line debugging over the weekend, I managed to find where the problem was.

    Basically, unbeknownst to me, the threads created by Parallel.For did not inherit the CultureInfo I had set (this is the normal behaviour of threads, which I didn't know). As a result, strings like 3.256 were being parsed as 3256.0, because the worker threads read the full stop as a thousands separator rather than a decimal separator. This caused the issues I found with the output. (Note: the default locale on my computer uses a comma as the decimal separator, but I had set it to the full stop in Program.cs for all my code. I had incorrectly assumed this would be inherited by new threads.)
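To make the misparse concrete, here is a minimal sketch that reproduces it on a single thread. It is not taken from the original program: the class name CultureParseDemo and the helper CommaDecimalCulture are hypothetical, and the comma-decimal culture is built by hand from the invariant culture so the result does not depend on the machine's locale settings.

```csharp
using System;
using System.Globalization;

public static class CultureParseDemo
{
    // Hypothetical helper: builds a culture in which ',' is the decimal
    // separator and '.' is the group (thousands) separator.
    public static CultureInfo CommaDecimalCulture()
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = ",";
        culture.NumberFormat.NumberGroupSeparator = ".";
        return culture;
    }

    public static void Main()
    {
        const string text = "3.256";

        // Under the comma-decimal culture the '.' is read as a group separator.
        double misread = double.Parse(text, CommaDecimalCulture());

        // The invariant culture uses '.' as the decimal separator.
        double intended = double.Parse(text, CultureInfo.InvariantCulture);

        Console.WriteLine(misread.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(intended.ToString(CultureInfo.InvariantCulture));
    }
}
```

No exception is thrown in either case, which is why the try-catch in the loop never fired: the text is valid in both cultures, it just means a different number.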

    The correct parallel snippet looks like this:

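    // Requires: using System.Globalization; using System.Threading; using System.Threading.Tasks;
    // Clone the current culture and force '.' as the decimal separator.
    CultureInfo newCulture = (CultureInfo)CultureInfo.CurrentCulture.Clone();
    newCulture.NumberFormat.NumberDecimalSeparator = ".";
    Parallel.For(0, M, i =>
    {
        // The worker threads did not pick up the culture set in Program.cs, so set it here.
        Thread.CurrentThread.CurrentCulture = newCulture;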
        double d;
        try
        {
            d = Double.Parse(lData[i]);
        }
        catch (Exception)
        {
            throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
        }
        GlobalVar.sgData[lCount % N][i] = d;
    });
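An alternative that may be worth considering is to leave the thread culture alone and pass the format explicitly to Double.Parse, so the result no longer depends on which thread runs the iteration. The following is only a sketch under that assumption: ParallelParseSketch and ParseFields are hypothetical names, and the output array stands in for one row of sg[][].

```csharp
using System;
using System.Globalization;
using System.Threading.Tasks;

public static class ParallelParseSketch
{
    // Hypothetical helper: parses every field in parallel using an explicit,
    // culture-independent number format ('.' as the decimal separator).
    public static double[] ParseFields(string[] fields)
    {
        var result = new double[fields.Length];
        Parallel.For(0, fields.Length, i =>
        {
            // NumberStyles.Float does not allow group separators, so "3.256"
            // can only be read as a decimal number here.
            // Each iteration writes only its own slot, so no locking is needed.
            result[i] = double.Parse(fields[i], NumberStyles.Float, CultureInfo.InvariantCulture);
        });
        return result;
    }

    public static void Main()
    {
        double[] values = ParseFields("3.256,0.5,-12.75".Split(','));
        foreach (double v in values)
        {
            Console.WriteLine(v.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

Because the format provider travels with the call rather than with the thread, the same code behaves identically in the serial and the parallel loop.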
    

    Thanks to all who pitched in with comments and opinions. Good information to improve my programming.

    I updated the question tags to reflect where the issue really was.