I am trying to speed up the execution by parallelizing the matching between a given pattern myPattern
and a long List
of patterns patterns
. The code I have below works fine if patterns
had only one pattern matching myPattern
but patterns
can have multiple patterns that will match and the Parallel.For
loop will write multiple i
and the loop will return multiple i's if they fell under different threads. I have sorted patterns
in a way that first pattern on the list is the one needed. This will work for serial version below but not for the parallel How can I break from the Parallel.For
loop at the lowest iteration that matches the condition set inside the loop? For example if patterns[i]
i = 26, i = 120, i= 1832 match, only get i = 26.
//Serial
for (int i = 0; i < patterns.Count; i++)
{
bool match = MyMatcher(myPattern, patterns[i]);
if (match)
{
Console.WriteLine(i.ToString());
break;
}
}
//Parallel
Parallel.For(0, patterns.Count, new ParallelOptions { MaxDegreeOfParallelism = 4 },
(int i, ParallelLoopState state) =>
{
bool match = MyMatcher(myPattern, patterns[i]);
if (match)
{
Console.WriteLine(i.ToString());
state.Stop();
}
});
Instead of the Stop
, you could use the Break
functionality of the Parallel.For
. The Break
is more relaxed than the Stop
.
It doesn't terminate the loop instantly. It terminates it when all lower indices in the patterns
list have been processed, giving the chance for a better match to be found.
Another issue is that the Parallel.For
employs range partitioning, which is not suitable for your case. The range partitioning strategy splits the list in ranges, and dedicates one thread per range. In your case it is desirable to process the lower indices first, because even if you find a match in an upper index, you would still need to search all the lower indices for better matches. You can prioritize the lower indices by switching to the Parallel.ForEach
method, and opting for dynamic partitioning with the Partitioner.Create
method. Either the Partitioner.Create(patterns)
or the Partitioner.Create(patterns, loadBalance: true)
will do the job. The first option enumerates the list using the GetEnumerator
of the list (synchronized), and the second enumerates it using the Count
and the indexer []
of the list. The second option should be slightly more efficient because it is lock-free (source code), but the first has clearer incremental enumeration semantics. In the example below I am using the first option:
var partitioner = Partitioner.Create(patterns);
ParallelOptions options = new() { MaxDegreeOfParallelism = 4 };
ParallelLoopResult result = Parallel.ForEach(partitioner, options, (p, state) =>
{
bool match = MyMatcher(myPattern, p);
if (match) state.Break();
});
if (result.LowestBreakIteration.HasValue)
Console.WriteLine(patterns[(int)result.LowestBreakIteration.Value]);
else
Console.WriteLine("Not found"); // The Break was not called
After the completion of the loop you can examine the ParallelLoopResult.LowestBreakIteration
, and see if the Break
was called, and what was the lower index that called it. This index corresponds to the pattern that you are searching for.