Tags: c#, performance, stream, asp.net-core-webapi, memory-efficient

Handling large byte arrays in C#


I have a simple ASP.NET Core Web API that processes large documents (> 10 MB and < 50 MB). Basically, it reads a document from a CRM such as Salesforce, processes it with Aspose, and sends the processed document to multiple destinations, e.g. Salesforce, email, etc.

Instead of using a byte array, I thought of using streams. But after I process the document I end up with a single output stream, so how can I send that one stream to multiple systems in parallel? Since a stream has a single read position and can't be read by several consumers at once, do I need to clone it? Wouldn't cloning cause the same memory issues again? How can I handle large documents in a memory-efficient way and still send them to multiple destinations in parallel?


Solution

  • Indeed, a single Stream cannot reasonably be shared by multiple consumers. Making concrete suggestions would require a lot more context about how these systems work and what kind of access they need to the data, but a few thoughts off-hand:

    • in either case: leased memory (think ArrayPool<byte>.Shared), so large buffers are rented and recycled rather than allocated per request; see the first sketch after this list
    • incremental: some kind of push model (rather than pull), so that as the source iterates it pushes each chunk of payload to all consumers in turn, and what they do with it is up to them; see the second sketch below
    • all-at-once: some kind of discontiguous chain, maybe ReadOnlySequence<byte>, and have all consumers signal back when they're done with it, so you can recycle the chunks; see the third sketch below
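
For the leased-memory point, here is a minimal sketch of pumping a stream through a rented buffer. The 81920-byte chunk size is an arbitrary choice (it matches the default Stream.CopyTo buffer size), not something the question mandates:

```csharp
using System.Buffers;
using System.IO;
using System.Threading.Tasks;

static async Task CopyWithLeasedBufferAsync(Stream source, Stream destination)
{
    // Rent a reusable buffer instead of allocating a fresh byte[] per request;
    // this keeps large, short-lived arrays off the large object heap.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(81920);
    try
    {
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read);
        }
    }
    finally
    {
        // Always return the lease, even on failure, so the pool can recycle it.
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
```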
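For the incremental push model, one possible shape is sketched below. Both IChunkConsumer and PushToAllAsync are invented names for illustration, not an existing API; the idea is simply that the source reads into one leased buffer and hands the same chunk to every consumer before reading more, so only one buffer's worth of data is in flight at a time:

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical consumer contract: each destination (Salesforce, email, ...)
// receives every chunk in turn and must not hold onto the memory after returning.
public interface IChunkConsumer
{
    Task ConsumeAsync(ReadOnlyMemory<byte> chunk);
    Task CompleteAsync();
}

public static class ChunkPusher
{
    public static async Task PushToAllAsync(Stream source, IReadOnlyList<IChunkConsumer> consumers)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(81920);
        try
        {
            int read;
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Push the same leased chunk to every consumer in turn
                // before overwriting it with the next read.
                foreach (var consumer in consumers)
                {
                    await consumer.ConsumeAsync(buffer.AsMemory(0, read));
                }
            }
            foreach (var consumer in consumers)
            {
                await consumer.CompleteAsync();
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```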
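For the all-at-once option, a sketch of stitching leased buffers into a single ReadOnlySequence<byte> and recycling them once every consumer has signalled completion. LeasedSegment and SequenceLease are hypothetical types made up for this sketch; only ReadOnlySequenceSegment<byte>, ReadOnlySequence<byte>, and ArrayPool<byte> are real framework pieces:

```csharp
using System;
using System.Buffers;
using System.Threading;

// Hypothetical segment type that chains leased buffers into one ReadOnlySequence<byte>.
sealed class LeasedSegment : ReadOnlySequenceSegment<byte>
{
    private readonly byte[] _lease;

    public LeasedSegment(byte[] lease, int length, LeasedSegment? previous)
    {
        _lease = lease;
        Memory = new ReadOnlyMemory<byte>(lease, 0, length);
        if (previous != null)
        {
            RunningIndex = previous.RunningIndex + previous.Memory.Length;
            previous.Next = this;
        }
    }

    public void ReturnLease() => ArrayPool<byte>.Shared.Return(_lease);
}

// Hypothetical wrapper: hands out the sequence and recycles every segment's
// buffer once all consumers have called Release().
sealed class SequenceLease
{
    private readonly LeasedSegment _first;
    private int _pending;

    public SequenceLease(LeasedSegment first, LeasedSegment last, int consumerCount)
    {
        _first = first;
        _pending = consumerCount;
        Sequence = new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }

    public ReadOnlySequence<byte> Sequence { get; }

    // Each consumer calls this exactly once when it has finished reading.
    public void Release()
    {
        if (Interlocked.Decrement(ref _pending) == 0)
        {
            for (LeasedSegment? s = _first; s != null; s = (LeasedSegment?)s.Next)
            {
                s.ReturnLease();
            }
        }
    }
}
```

The countdown means consumers can read the shared sequence concurrently (ReadOnlySequence<byte> is safe for concurrent readers), and the buffers go back to the pool only after the last one finishes.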