
Pipelines buffer preserving until processing is complete

I am researching the possibility of using pipelines for processing binary messages coming from the network. The binary messages I will be processing come with a payload, and it is desirable to keep the payload in its binary form.

The idea is to read out the whole message and create a slice of the message and its payload. Once the message is completely read, it will be passed to a channel chain for processing. The processing will not be instant: it might take some time or be executed later, and the goal is not to have the pipe reader wait until the processing is complete. Once the message processing is complete, I would need to release the processed buffer region to the pipe writer.

Now of course I could just create a new byte array and copy the data coming from the pipe writer, but that would defeat the purpose of no-copy. So as I understand it, I would need some buffer synchronization between the pipeline and the channel? I looked at the available APIs of the pipe reader (AdvanceTo), where it's possible to tell the pipe reader what was consumed and what was examined, but I can't figure out how this could be synced outside of the pipe reading method.

So the question would be whether there are some techniques or examples on how this can be achieved.


  • The buffer obtained from TryRead/ReadAsync is only valid until you call AdvanceTo. As soon as you've done that, anything you reported as consumed is available to be recycled for use elsewhere (which could be parallel/concurrent readers). Strictly speaking, you shouldn't treat even the bits you haven't reported as consumed as valid once you've called AdvanceTo. In reality they will likely still be the same segments, but that isn't the concern of the caller: to the caller, the buffer is only valid between the read and the advance.

    This means that you explicitly can't do:

    while (true) {
        var result = await reader.ReadAsync();
        var buffer = result.Buffer;
        if (TryIdentifyFrameBoundary(buffer, out var frame)) {
            BeginProcessingInBackground(frame); // <==== THIS IS A PROBLEM!
            reader.AdvanceTo(frame.End, frame.End);
        } else { // take nothing
            reader.AdvanceTo(buffer.Start, buffer.End);
            if (result.IsCompleted) break; // that's all folks
        }
    }

    because the "in background" bit, when it fires, could now be reading someone else's data (due to it being reused already).

    So: either you need to process the frame contents as part of the read loop, or you're going to have to make a copy of the data, most likely by using:

    var len = checked((int)buffer.Length);
    var oversized = ArrayPool<byte>.Shared.Rent(len);
    buffer.CopyTo(oversized); // copy while the buffer is still valid, i.e. before AdvanceTo

    and pass oversized to your background processing, remembering to only look at the first len bytes of it. You could pass this as a ReadOnlyMemory<byte>, but you need to consider that you're also going to want to return it to the array-pool afterwards (probably in a finally block), and passing it as a memory makes it a little more awkward (but not impossible, thanks to MemoryMarshal.TryGetArray).
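
    As a rough illustration of the copy-and-return approach described above, here is a minimal sketch. The names FrameHandoff, CopyToPooled, ReturnToPool and ConsumeAsync are hypothetical helpers invented for this example, and using System.Threading.Channels for the hand-off is an assumption based on the question; it is one possible shape, not the only way to do it.

    ```csharp
    using System;
    using System.Buffers;
    using System.Runtime.InteropServices;
    using System.Threading.Channels;
    using System.Threading.Tasks;

    // Hypothetical helper type, for illustration only.
    public static class FrameHandoff
    {
        // Copies one frame into an array rented from the shared pool.
        // Call this BEFORE reader.AdvanceTo(...), while the sequence is still valid.
        public static ReadOnlyMemory<byte> CopyToPooled(ReadOnlySequence<byte> frame)
        {
            int len = checked((int)frame.Length);
            byte[] oversized = ArrayPool<byte>.Shared.Rent(len); // may be longer than len
            frame.CopyTo(oversized.AsSpan());                    // the one copy we pay for
            // Expose only the first len bytes; the rest of the array is garbage.
            return new ReadOnlyMemory<byte>(oversized, 0, len);
        }

        // Returns the backing array to the pool.
        // Only call this for memory that came from CopyToPooled, and only once.
        public static void ReturnToPool(ReadOnlyMemory<byte> payload)
        {
            if (MemoryMarshal.TryGetArray(payload, out ArraySegment<byte> segment)
                && segment.Array != null)
            {
                ArrayPool<byte>.Shared.Return(segment.Array);
            }
        }

        // Background consumer: processes each payload, then always returns the array.
        public static async Task ConsumeAsync(
            ChannelReader<ReadOnlyMemory<byte>> frames,
            Func<ReadOnlyMemory<byte>, ValueTask> handle)
        {
            await foreach (ReadOnlyMemory<byte> payload in frames.ReadAllAsync())
            {
                try
                {
                    await handle(payload);
                }
                finally
                {
                    ReturnToPool(payload);
                }
            }
        }
    }
    ```

    In the read loop, the call to CopyToPooled would go before reader.AdvanceTo, and the returned memory would be written to the channel instead of the original frame. After that the pipe is free to recycle its segments, because the background code never touches them.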

    Note: in early versions of the pipelines API, there was an element of reference-counting, which did allow you to preserve buffers, but it had a few problems:

    • it complicated the API hugely
    • it led to leaked buffers
    • it was ambiguous and confusing what "preserved" meant; is the count until it gets reused? or released completely?

    so that feature was dropped.