c#, .net, task-parallel-library, tpl-dataflow

Skip Item in Dataflow TransformBlock


TPL Dataflow provides a TransformBlock for transforming input, e.g.:

var tb = new TransformBlock<int, int>(i => i * 2);

Is it possible to not output some of the input, e.g. if the input fails some validation test?

var tb = new TransformBlock<InputType, OutputType>(i =>
{
    if (!ValidateInput(i))
    {
        // Do something to not output anything for this input
    }
    // Normal output
    return MyTransform(i);
});

If that is not possible, what would be the best pattern to achieve that end?
Something like the following?

BufferBlock<OutputType> output = new BufferBlock<OutputType>();

var ab = new ActionBlock<InputType>(i =>
{
    if (ValidateInput(i)) 
    {
        output.Post(MyTransform(i));
    }
});

Solution

There are several ways to do this:

    1. Use TransformManyBlock as Jon suggested and return a collection containing 1 or 0 items.
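    2. Use TransformBlock with a special value representing “no value” (e.g. null), then use LinkTo() with a filter so that only real values reach the next block. You also have to link the TransformBlock to a null block (DataflowBlock.NullTarget<T>()) without a filter to drain the special values; otherwise the first unconsumed special value stays at the head of the output queue and stalls the block. Add the unfiltered link after the filtered one, because messages are offered to targets in the order the links were created.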
    3. I would consider this something of a hack, but you can also use the Task-based constructor of TransformBlock: return Task.FromResult() when you want to output something and a null Task when you don't. For example:

      new TransformBlock<int, int>(i => i % 2 == 0 ? Task.FromResult(i * 2) : null)
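Options 1 and 2 can be sketched roughly as follows. This is only an illustration under some assumptions: IsValid is a hypothetical stand-in for ValidateInput (here it accepts even numbers), doubling stands in for the real transform, and the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SkipItemSketch
{
    // Hypothetical validation rule standing in for ValidateInput: even numbers pass.
    static bool IsValid(int i) => i % 2 == 0;

    // Option 1: TransformManyBlock yields one output for valid input, none otherwise.
    public static async Task<List<int>> WithTransformManyAsync(IEnumerable<int> inputs)
    {
        var block = new TransformManyBlock<int, int>(i =>
            IsValid(i) ? new[] { i * 2 } : Array.Empty<int>());

        var results = new List<int>();
        var sink = new ActionBlock<int>(x => results.Add(x));
        block.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var i in inputs) block.Post(i);
        block.Complete();
        await sink.Completion;
        return results;
    }

    // Option 2: TransformBlock emits null as the "no value" marker,
    // a filtered link forwards real values, and NullTarget drains the rest.
    public static async Task<List<int>> WithFilteredLinkAsync(IEnumerable<int> inputs)
    {
        var block = new TransformBlock<int, int?>(i => IsValid(i) ? i * 2 : (int?)null);

        var results = new List<int>();
        var sink = new ActionBlock<int?>(x => results.Add(x.Value));

        // Filtered link first: only non-null values are offered to the sink.
        block.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true },
                     x => x.HasValue);
        // Unfiltered link second: whatever the filter rejected is discarded here.
        block.LinkTo(DataflowBlock.NullTarget<int?>());

        foreach (var i in inputs) block.Post(i);
        block.Complete();
        await sink.Completion;
        return results;
    }

    public static async Task Main()
    {
        var inputs = new[] { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(string.Join(", ", await WithTransformManyAsync(inputs)));
        Console.WriteLine(string.Join(", ", await WithFilteredLinkAsync(inputs)));
    }
}
```

With the default block options (one item processed at a time, output in input order), both methods should produce the same sequence for the same input. Option 1 keeps the output type unchanged, while option 2 needs an output type that can carry the marker (int? here) and an extra link to drain it.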