Tags: c#, parallel-processing, task-parallel-library, parallel.foreach, tpl-dataflow

Parallel.ForEach vs ActionBlock


For a given MaxDegreeOfParallelism and a fixed number of objects that need to be processed (i.e. have certain code executed on them), it would seem that Parallel.ForEach and an ActionBlock would be equally useful.

What considerations would need to be taken into account when choosing one over the other?


Solution

  • Yes, both the Parallel.ForEach and the ActionBlock<T> can be used to process a list of items in parallel. Between the two, the Parallel.ForEach is the more natural choice, because it communicates its purpose clearly and requires less study before using it. Both have gotchas that might catch you by surprise. Here are some things that you should keep in mind:

    1. The Parallel.ForEach processes the items in an order that depends on the type of the source. If it's a list or an array the order will be quite peculiar, because the Parallel.ForEach will partition the list in ranges and will assign a worker task to each range (range partitioning). So you'll see the items processed like this: 1, 26, 51, 76, 2, 27, 52, 77..., instead of the quasi-sequential 1, 2, 4, 3, 5, 8, 6, 7 etc. If the source is an IEnumerable<T>, then the order will be the natural start-to-end order. The ActionBlock<T> processes the items in the order that you Post them, so there are no surprises there.

    2. When the source is an IEnumerable<T>, the Parallel.ForEach uses chunk partitioning by default, meaning that it doesn't grab just one item from the source at a time. It accumulates items in small chunks, and then starts processing them. This might catch you by surprise if, for example, your source is a BlockingCollection<T>: you will add an item to the collection, the Parallel.ForEach won't process it immediately, and you'll wonder why. The ActionBlock<T> takes items from its own buffer one by one, so no surprises there.

    3. The ActionBlock<T> by default has MaxDegreeOfParallelism = 1 (i.e. no parallelism). By contrast, the Parallel.ForEach by default has MaxDegreeOfParallelism = -1 (i.e. unlimited parallelism). The Parallel.ForEach has by far the more dangerous default, because if you forget to configure the MaxDegreeOfParallelism it will quickly saturate your ThreadPool, and with a saturated ThreadPool the other concurrent operations of your program will stutter (see the sketch after this list).

    4. The ActionBlock<T> has the annoying "by design" behavior of swallowing any OperationCanceledException thrown by the action. So if processing an item can fail with an OperationCanceledException, i.e. if this exception denotes failure instead of cancellation, the ActionBlock<T> will complete happily with no exception, as if nothing happened, hiding from you the fact that the processing of some items actually failed (this is also demonstrated in the sketch below).
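
    As a minimal sketch of how each API is configured and fed (points 3 and 4 in particular); the Process method, the value 4 and the item 42 are illustrative placeholders, and the ActionBlock<T> requires the System.Threading.Tasks.Dataflow NuGet package:

        using System;
        using System.Linq;
        using System.Threading.Tasks;
        using System.Threading.Tasks.Dataflow;

        class Example
        {
            static async Task Main()
            {
                int[] items = Enumerable.Range(1, 100).ToArray();

                // Point 3: the Parallel.ForEach has unlimited parallelism,
                // unless you cap it explicitly with a ParallelOptions.
                Parallel.ForEach(items,
                    new ParallelOptions { MaxDegreeOfParallelism = 4 },
                    item => Process(item));

                // Point 3: the ActionBlock<T> is sequential, unless you raise its
                // MaxDegreeOfParallelism. Point 1: the items are processed in the
                // order they are posted.
                var block = new ActionBlock<int>(item => Process(item),
                    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });
                foreach (int item in items) block.Post(item);
                block.Complete();
                await block.Completion;

                // Point 4: per the behavior described above, an OperationCanceledException
                // thrown by the action is swallowed and the block completes successfully.
                var swallowing = new ActionBlock<int>(item =>
                {
                    if (item == 42) throw new OperationCanceledException($"Item {item} failed");
                    Process(item);
                });
                foreach (int item in items) swallowing.Post(item);
                swallowing.Complete();
                await swallowing.Completion; // Completes without throwing.
                Console.WriteLine(swallowing.Completion.Status); // RanToCompletion
            }

            static void Process(int item) { /* CPU-bound or blocking work on the item */ }
        }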

    The peculiarities of the Parallel.ForEach that I mentioned earlier can be fixed easily by wrapping the source in an appropriate Partitioner, as shown in this answer. A more drastic fix is to switch to the newer Parallel.ForEachAsync API. Although the Parallel.ForEachAsync has Async in its name, it can process synchronous workloads just as easily and efficiently: just return ValueTask.CompletedTask from the body, and Wait the resulting Task. The Parallel.ForEachAsync employs no surprising chunking/partitioning strategies (although this is not documented), and it processes the items in the natural start-to-end order. It is also less aggressive about owning the ThreadPool, since by default it has MaxDegreeOfParallelism equal to Environment.ProcessorCount, which is a sensible default for most scenarios. Its worker tasks are synchronized asynchronously when they take items from the source, so if the source is an empty BlockingCollection<T> only one thread will be blocked. It lacks some functionality that the Parallel.ForEach has, like breaking and getting the LowestBreakIteration, but these features are rarely used in practice. Both options are sketched below.
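
    As a rough sketch of both options (the Process method and the degree of parallelism 4 are again placeholders, and the Parallel.ForEachAsync requires .NET 6 or later):

        using System.Collections.Concurrent;
        using System.Linq;
        using System.Threading.Tasks;

        class Example
        {
            static void Main()
            {
                int[] items = Enumerable.Range(1, 100).ToArray();

                // Option 1: keep the Parallel.ForEach, but wrap the source in a partitioner
                // that disables buffering, so that the items are taken one at a time.
                var source = Partitioner.Create(items, EnumerablePartitionerOptions.NoBuffering);
                Parallel.ForEach(source,
                    new ParallelOptions { MaxDegreeOfParallelism = 4 },
                    item => Process(item));

                // Option 2: switch to the Parallel.ForEachAsync with a synchronous body.
                // Do the work, return ValueTask.CompletedTask, and Wait the resulting Task.
                Parallel.ForEachAsync(items,
                    new ParallelOptions { MaxDegreeOfParallelism = 4 },
                    (item, cancellationToken) =>
                    {
                        Process(item);
                        return ValueTask.CompletedTask;
                    }).Wait();
            }

            static void Process(int item) { /* CPU-bound or blocking work on the item */ }
        }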