task-parallel-library tpl-dataflow

TPL dataflow, MaxDegreeOfParallelism vs load balancing


With TPL Dataflow, one can set MaxDegreeOfParallelism on a block so that it processes messages in parallel. Another way is to 'load balance' the work by linking the source to multiple target blocks and limiting the BoundedCapacity of each target.

Question: what is the difference between the two approaches? Why should I bother with load balancing if I can just set MaxDegreeOfParallelism?
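To make the second setup concrete, here is a minimal sketch of what linking one source to several bounded targets might look like. The names LoadBalanceSketch and RunAsync, the 10 ms delay, and the worker count are illustrative assumptions, not taken from any real code base.

```csharp
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

public static class LoadBalanceSketch
{
    // Hypothetical helper: sends itemCount integers through one BufferBlock
    // linked to workerCount bounded ActionBlocks and returns how many items
    // each worker handled.
    public static async Task<int[]> RunAsync(int itemCount, int workerCount)
    {
        var handled = new int[workerCount];
        var source = new BufferBlock<int>();
        var workers = new ActionBlock<int>[workerCount];

        for (int i = 0; i < workerCount; i++)
        {
            int id = i; // per-worker copy captured by the lambda
            workers[i] = new ActionBlock<int>(
                async item =>
                {
                    await Task.Delay(10); // stand-in for real work
                    handled[id]++;        // each worker handles one item at a time
                },
                // BoundedCapacity = 1: a busy worker does not accept another
                // message, so the source offers it to the next linked target.
                new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

            source.LinkTo(workers[i],
                new DataflowLinkOptions { PropagateCompletion = true });
        }

        for (int n = 0; n < itemCount; n++)
            source.Post(n);
        source.Complete();

        // Completion propagates from the source to every linked worker.
        await Task.WhenAll(workers.Select(w => w.Completion));
        return handled;
    }
}
```

Calling `await LoadBalanceSketch.RunAsync(20, 4)` returns one count per worker. The counts add up to 20, but how they are split across workers depends on timing.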


Solution

  • MaxDegreeOfParallelism limits how many messages a block processes concurrently, which is exactly what you want here, and you can easily set it per block through the options you pass when constructing it. 'Load balancing', as you call it, is not really a realistic concept in this case.

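    Linking one source to multiple targets means deciding how messages are distributed. A BroadcastBlock is not what you want, as it offers every message to all linked targets. Predicates on the links to decide the flow would get messy and very stateful, as you would have to track which block gets the next message. Bounding each target, so that a full block does not accept a message and the source offers it to the next link, avoids that tracking, but it still leaves you with more blocks and links to configure than a single option on one block. TPL Dataflow is more in line with the actor model, where the message is the state, so introducing such dynamic routing state is not really in keeping with the model.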
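As a rough illustration of the MaxDegreeOfParallelism approach, the sketch below runs everything through a single ActionBlock and records the highest number of items seen in flight at once. The names MaxDopSketch and RunAsync and the 10 ms delay are assumptions made for the example.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

public static class MaxDopSketch
{
    // Hypothetical helper: processes itemCount integers in one ActionBlock and
    // returns the highest number of items observed in flight at the same time.
    public static async Task<int> RunAsync(int itemCount, int maxDop)
    {
        int inFlight = 0;
        int peak = 0;
        var gate = new object();

        var worker = new ActionBlock<int>(
            async item =>
            {
                lock (gate)
                {
                    inFlight++;
                    peak = Math.Max(peak, inFlight);
                }
                await Task.Delay(10); // stand-in for real work
                lock (gate) { inFlight--; }
            },
            // The block itself caps how many items it processes concurrently.
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxDop });

        for (int n = 0; n < itemCount; n++)
            worker.Post(n);
        worker.Complete();

        await worker.Completion;
        return peak;
    }
}
```

With `await MaxDopSketch.RunAsync(20, 4)` the returned peak should never exceed 4. There is one block, one option, and no links to manage, which is the simplicity argued for above.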