Please take a look at my TPL Dataflow network scheme below. There is a list of URLs, a number of Load blocks and a Parse block. The Load blocks load HTML pages through different proxy servers, and all of them are linked to the Parse block, where the CPU-bound work happens. If an exception occurs while a page is loading, the URL is added back to the list.
I post URLs to the Load blocks with a hand-made loop (shown in the picture). My question: is there any block type that can choose which Load block to post a URL to, instead of my hand-made loop? For example, it would post URLs to the first Load block with .InputCount <= 2.
And one more thing. A proxy server can become unavailable while the Dataflow network is running. I think that if I use a BufferBlock instead of the URL list, I could dynamically unlink the Load blocks with dead proxies from this BufferBlock, if that is possible. So is there a way to dynamically unlink blocks from the network?
Is there any block type that can choose which Load block to post a URL to, instead of my hand-made loop? For example, it would post URLs to the first Load block with .InputCount <= 2.
What you can do is to have a single BufferBlock that is linked to all the load blocks. You would then set BoundedCapacity of the load blocks to something like 3 (1 item being processed + 2 in the input and output queues). With this setup, items will wait in the BufferBlock until space becomes available in one of the load blocks.
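A minimal sketch of that setup, under some assumptions: the proxy names, the LoadPageAsync helper, and the URLs are hypothetical placeholders, and the Parse block just counts pages instead of doing real parsing. Because several load blocks feed one Parse block, completion is not propagated over those links here; the Parse block is completed manually once every load block has finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class LoadBalancingSketch
{
    // Hypothetical stand-in for downloading a page through a given proxy.
    private static async Task<string> LoadPageAsync(string url, string proxy)
    {
        await Task.Delay(50); // simulate network latency
        return $"<html>{url} via {proxy}</html>";
    }

    // Pushes urlCount placeholder URLs through the network and
    // returns how many pages reached the Parse block.
    public static async Task<int> RunAsync(int urlCount)
    {
        // Single source of work; items wait here until a load block has room.
        var urls = new BufferBlock<string>();

        // Stand-in for the CPU-bound Parse block. The default
        // MaxDegreeOfParallelism is 1, so the counter needs no locking.
        int parsed = 0;
        var parse = new ActionBlock<string>(html => { parsed++; });

        var proxies = new[] { "proxy-a", "proxy-b", "proxy-c" }; // assumed names
        var loadBlocks = new List<TransformBlock<string, string>>();

        foreach (var proxy in proxies)
        {
            // BoundedCapacity = 3 makes a busy block decline further items,
            // so the BufferBlock offers them to the next linked block.
            var load = new TransformBlock<string, string>(
                url => LoadPageAsync(url, proxy),
                new ExecutionDataflowBlockOptions { BoundedCapacity = 3 });

            urls.LinkTo(load, new DataflowLinkOptions { PropagateCompletion = true });
            load.LinkTo(parse);
            loadBlocks.Add(load);
        }

        for (int i = 0; i < urlCount; i++)
        {
            urls.Post($"http://example.com/page{i}");
        }
        urls.Complete();

        // Wait for every load block to drain, then complete Parse manually.
        await Task.WhenAll(loadBlocks.Select(b => b.Completion));
        parse.Complete();
        await parse.Completion;
        return parsed;
    }
}
```

Calling `await LoadBalancingSketch.RunAsync(10)` should push all ten placeholder URLs through whichever load blocks have free capacity; no hand-written dispatch loop is involved.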
is there a way to dynamically unlink blocks from network?
Yes, LinkTo() returns an IDisposable which can be used to destroy that link (by calling Dispose()).
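As a rough illustration of unlinking, the sketch below links a BufferBlock to two hypothetical load blocks, keeps the IDisposable returned by each LinkTo() call, and disposes the link to the "dead proxy" block before any URLs are posted. The block names and the string tags are assumptions made for the example; in a real network the Dispose() call would more likely happen when a proxy failure is detected, and items the block had already accepted before that point would stay in it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class UnlinkSketch
{
    // Returns a record of which block handled each URL.
    public static async Task<List<string>> RunAsync()
    {
        var urls = new BufferBlock<string>();
        var handled = new List<string>();

        // Hypothetical load blocks; each just records which one got the URL.
        var deadProxyLoad = new ActionBlock<string>(
            url => handled.Add("dead:" + url),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 3 });
        var liveProxyLoad = new ActionBlock<string>(
            url => handled.Add("live:" + url),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 3 });

        // Keep the IDisposable returned by LinkTo so the link can be removed later.
        IDisposable deadLink = urls.LinkTo(deadProxyLoad);
        urls.LinkTo(liveProxyLoad, new DataflowLinkOptions { PropagateCompletion = true });

        // The proxy is considered dead: remove its link.
        // From now on the BufferBlock no longer offers items to that block.
        deadLink.Dispose();

        urls.Post("http://example.com/a");
        urls.Post("http://example.com/b");
        urls.Complete();

        await liveProxyLoad.Completion;
        return handled;
    }
}
```

Since the link is disposed before anything is posted, every URL in this sketch ends up in the block that is still linked.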