c# .net exception task-parallel-library tpl-dataflow

TPL Dataflow blocks run forever. Forever running producer, consumer modelling and exception handling


I am using the TPL Dataflow library to implement a producer-consumer scenario. The processing involves a pipeline of tasks, so the Dataflow library suits my use case well.

But I want to know how to implement this use case efficiently [details below].

I want to use TPL Dataflow in a server type setting.

By a server type setting I mean that the data stream is produced continuously [asynchronously], forever. The consumption task also runs forever and consumes all the data produced by the producer [asynchronously]. Thus my blocks run forever.

How do I model this scenario efficiently? Moreover, how can I handle exceptions, given that I cannot call Wait()? [As far as I understand, without a call to Wait() I would not be able to catch the exceptions thrown on a faulted block.]


Solution

  • Exceptions

    I usually wrap the delegates with exception handling because, as you said, a block's exception is stored in its Completion task. Moreover, a faulted block stays faulted, so you would need to replace it to move on.

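    // Requires: using System; using System.Diagnostics; using System.Threading.Tasks.Dataflow;
    var block = new TransformBlock<string, int>(number =>
    {
        try
        {
            return int.Parse(number);
        }
        catch (Exception e)
        {
            // Log and swallow the exception so the block does not fault.
            Trace.WriteLine(e);
            // Every code path must return a value; -1 is an arbitrary
            // sentinel here, so pick whatever suits your data.
            return -1;
        }
    });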
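If you would still like to know when a block faults despite the wrapping, you can observe its Completion task asynchronously instead of calling Wait(). The following is a minimal sketch of that idea, not the only way to do it; `FaultObserver` and `ObserveFaultsAsync` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FaultObserver
{
    // Hypothetical helper: awaits the block's Completion task in the
    // background and passes any exception to onFault. The caller is never
    // blocked, so this fits a block that is meant to run "forever".
    public static async Task ObserveFaultsAsync(IDataflowBlock block, Action<Exception> onFault)
    {
        try
        {
            // Completes when the block completes, faults, or is canceled.
            await block.Completion.ConfigureAwait(false);
        }
        catch (Exception e)
        {
            // await unwraps the AggregateException and rethrows its first
            // inner exception. A canceled block also ends up here, as an
            // OperationCanceledException.
            onFault(e);
        }
    }
}
```

A call such as `FaultObserver.ObserveFaultsAsync(block, e => Trace.WriteLine(e))` right after creating the block is enough to get the fault logged. Replacing the faulted block is still up to you.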
    

    Capacity

    Another important issue is capping. If some part of your workflow can't handle the load, its input queue would simply grow without limit. That could lead to ever-growing memory use or an OutOfMemoryException. So it's important to limit all your blocks with an appropriate BoundedCapacity and to decide what to do when that limit is reached (throw items away, save them to storage, etc.)
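As a rough sketch of the capping idea, the code below creates a bounded consumer and shows two possible policies for a full block: dropping the item with Post, or applying back-pressure with SendAsync. `BoundedPipeline` and its method names are hypothetical; only `BoundedCapacity`, `Post` and `SendAsync` come from the library.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedPipeline
{
    // Creates a consumer that holds at most `capacity` items, counting
    // both queued items and items currently being processed.
    public static ActionBlock<T> CreateConsumer<T>(Func<T, Task> handler, int capacity)
    {
        return new ActionBlock<T>(handler, new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = capacity
        });
    }

    // Drop policy: Post returns false immediately when the block is full
    // (or no longer accepting input), and the item is simply discarded.
    public static bool TryEnqueueOrDrop<T>(ITargetBlock<T> target, T item)
    {
        return target.Post(item);
    }

    // Back-pressure policy: the returned task completes only once the
    // block has room and accepts the item (true) or declines it (false).
    public static Task<bool> EnqueueWithBackPressureAsync<T>(ITargetBlock<T> target, T item)
    {
        return target.SendAsync(item);
    }
}
```

Which policy is right depends on whether losing items is acceptable. With back-pressure, an awaiting producer slows down to the consumer's pace instead of filling memory.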

    Parallelism

    While the default value for BoundedCapacity is -1 (unbounded), the default value for MaxDegreeOfParallelism is 1 (no parallelism). Most applications can easily benefit from parallelism, so make sure to set an appropriate MaxDegreeOfParallelism value. When a block's delegate is purely CPU-intensive, MaxDegreeOfParallelism shouldn't be much higher than the number of available cores. The less CPU-bound and the more I/O-bound the delegate is, the higher MaxDegreeOfParallelism can be set.
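The two cases could be sketched as follows. `ParallelBlocks`, `CreateCpuBound` and `CreateIoBound` are hypothetical names, and using `Environment.ProcessorCount` for CPU-bound work is one reasonable starting point rather than a rule.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ParallelBlocks
{
    // CPU-bound delegate: roughly one worker per available core.
    public static TransformBlock<TIn, TOut> CreateCpuBound<TIn, TOut>(Func<TIn, TOut> work)
    {
        return new TransformBlock<TIn, TOut>(work, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        });
    }

    // I/O-bound async delegate: the caller chooses a higher limit, since
    // most of the time is spent waiting rather than using a core.
    public static TransformBlock<TIn, TOut> CreateIoBound<TIn, TOut>(
        Func<TIn, Task<TOut>> work, int maxConcurrency)
    {
        return new TransformBlock<TIn, TOut>(work, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxConcurrency
        });
    }
}
```

By default the block still emits results in the order the inputs arrived, even when several items are processed at once.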

    Conclusion

    Using TPL Dataflow throughout the application's lifetime is really simple. Just make sure these settings can be configured through the app.config, and tweak them according to actual results "in the field".