I am trying to improve a C# project I am working on. Specifically, my goal is to parallelize some operations to reduce processing time. I am starting with small snippets to get the hang of it. The following (non-parallel) code works as expected:
for (int i = 0; i < M; i++)
{
    double d;
    try
    {
        d = Double.Parse(lData[i]);
    }
    catch (Exception)
    {
        throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
    }
    sg[lCount % N][i] = d;
}
By using the following (parallel) code I would expect to obtain the exact same results, but that is not the case.
Parallel.For(0, M, i =>
{
    double d;
    try
    {
        d = Double.Parse(lData[i]);
    }
    catch (Exception)
    {
        throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
    }
    sg[lCount % N][i] = d;
});
The part of the program these snippets are from reads data from a file, one line at a time. Each line is a sequence of comma-separated double-precision numbers, which I put in the array lData[] using String.Split(). Every N lines, the data sequence starts over with a new data frame (hence the % N in the row index when I assign the values).
It is my understanding (clearly wrong) that by putting the body of the (serial) for-loop in the third parameter of Parallel.For I parallelize its execution, and that this shouldn't change the results. Is the problem that the threads are all accessing lCount and M? Should I make thread-local copies?
Thanks.
(Since I'm new, I am not allowed to create the Parallel.For tag.)
EDIT:
I ran some more tests, looking at an output earlier in the code than I did before. It appears that the parallel version of my code does not fill the sg[][] array entirely. Rather, some values are left at their defaults (0, in my case).
EDIT 2 (to answer some of the comments):
lData[] is a string[] obtained by using string.Split(). The original string I am splitting is read from my data files. I wrote the code that generates them, so they are generally well-formatted (I still used the try-catch construct out of habit). Just before the for-loop (either parallel or serial) I check that lData[] has the correct number of values (M). If it doesn't, I throw an exception that prevents the program from reaching the for-loop in question.
sg[][] is an N by M array of type double (there was a typo in the snippets, now corrected; in my original code this error was not present). After I read N lines from the file, the array sg[][] contains a whole data set. After the for-loop (either parallel or serial) there is a portion of code that looks like this:
lCount++; //counting the lines I have already read
if((lCount % N) == 0)
{
    //do things with sg[][]
    //reset sg[][]
}
So I am overwriting all rows of sg[][] on purpose. The for-loop's whole purpose is to update the values in sg[][].
After doing some line-by-line debugging over the weekend, I managed to find where the problem was.
Basically, unbeknownst to me, the threads created by Parallel.For did not inherit the CultureInfo (this is the normal behaviour of threads, and I didn't know that). As a result, strings like 3.256 were being parsed as 3256.0. This caused the issues I found with the output.
(Note: the default locale on my computer uses a comma as the decimal separator, but I had set it to the full stop in Program.cs for all my code. I had incorrectly assumed this setting would be inherited by new threads.)
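To make the effect concrete, here is a minimal sketch of how the same string parses under two different number formats. The class and method names are hypothetical, and the comma-decimal culture is built by hand (a clone of the invariant culture with the separators swapped) so that the example does not depend on the machine's locale. With the default number styles, Double.Parse accepts the group separator, which is why 3.256 is read as three thousand two hundred fifty-six rather than rejected.

```csharp
using System;
using System.Globalization;

public static class DecimalSeparatorDemo
{
    // Hypothetical helper: a culture that uses "," for decimals and "." for
    // digit grouping, similar to what many European locales do.
    public static CultureInfo CommaDecimalCulture()
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = ",";
        culture.NumberFormat.NumberGroupSeparator = ".";
        return culture;
    }

    // Parses with the comma-decimal culture: "." is treated as a group
    // separator and silently skipped.
    public static double ParseWithCommaDecimal(string text)
    {
        return double.Parse(text, CommaDecimalCulture());
    }

    // Parses with the invariant culture: "." is the decimal separator.
    public static double ParseInvariant(string text)
    {
        return double.Parse(text, CultureInfo.InvariantCulture);
    }
}
```

Calling DecimalSeparatorDemo.ParseWithCommaDecimal("3.256") gives 3256, while DecimalSeparatorDemo.ParseInvariant("3.256") gives 3.256. No exception is thrown in either case, which is why the try-catch in my loop never fired.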
The correct parallel snippet looks like this:
CultureInfo newCulture = (CultureInfo)CultureInfo.CurrentCulture.Clone();
newCulture.NumberFormat.NumberDecimalSeparator = ".";
Parallel.For(0, M, i =>
{
    Thread.CurrentThread.CurrentCulture = newCulture;
    double d;
    try
    {
        d = Double.Parse(lData[i]);
    }
    catch (Exception)
    {
        throw new Exception("Wrong formatting on data number " + (i + 1) + " on line " + (lCount + 1));
    }
    GlobalVar.sgData[lCount % N][i] = d;
});
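A possible alternative, sketched here under the assumption that the data files always use a full stop as the decimal separator, is to pass the culture to the parse call itself instead of changing the culture of the worker threads. That way the result no longer depends on which thread runs an iteration, and the pool threads are left unmodified. FrameParser and FillRow are hypothetical names, not part of my project. As far as I know, newer .NET versions (apps targeting .NET Framework 4.6 or later) also flow the calling thread's culture into tasks, and CultureInfo.DefaultThreadCurrentCulture can set a default for new threads, so the behaviour I saw may depend on the framework version.

```csharp
using System;
using System.Globalization;
using System.Threading.Tasks;

public static class FrameParser
{
    // Hypothetical helper: parses every field of one input line into the
    // given row, in parallel, always using "." as the decimal separator.
    public static void FillRow(string[] fields, double[] row, int lineNumber)
    {
        Parallel.For(0, fields.Length, i =>
        {
            double d;
            // NumberStyles.Float does not allow group separators, so a
            // value such as "3,256" is rejected instead of misread.
            if (!double.TryParse(fields[i], NumberStyles.Float,
                                 CultureInfo.InvariantCulture, out d))
            {
                throw new FormatException("Wrong formatting on data number "
                    + (i + 1) + " on line " + lineNumber);
            }
            // Each iteration writes to a distinct index, so no locking
            // is needed here.
            row[i] = d;
        });
    }
}
```

In my loop this would be called as FrameParser.FillRow(lData, sg[lCount % N], lCount + 1). Note that an exception thrown inside Parallel.For reaches the caller wrapped in an AggregateException, so the calling code has to catch that type to see the formatting error.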
Thanks to all who pitched in with comments and opinions. Good information to improve my programming.
I updated the question tags to reflect where the issue really was.