.net tpl-dataflow

TransformBlock posting to output


My scenario is that I have a BufferBlock<Stream> receiving Streams from an external source, let's say the file system or some FTP server. These file Streams are passed to another block for processing.

The only catch is that some of these files are zipped, and I would like to add a block in the middle that unzips files when necessary and creates one output Stream for each entry in the archive.

However, I do not want to use TransformManyBlock, because that means I have to fully receive the ZIP Stream and create the whole array of output Streams at once.

I want this block to receive the ZIP Stream, start decompressing, and push each entry to the next block as soon as it is ready, so that the processing block can start working once the first file is decompressed instead of waiting until everything is decompressed.

How would I go about doing this?


Solution

  • I realized my problem was that yield and async cannot be used together. After refactoring I no longer needed async, and came up with the following (simplified) version:

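    var block = new TransformManyBlock<Stream, Stream>(input => ExtractEntries(input));

    // yield return is not allowed inside a lambda, so the iterator is a local function (C# 7+).
    IEnumerable<Stream> ExtractEntries(Stream input)
    {
        // leaveOpen: true; the archive is not disposed because the entry streams read from it.
        var archive = new System.IO.Compression.ZipArchive(input, System.IO.Compression.ZipArchiveMode.Read, true);
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            if (string.IsNullOrWhiteSpace(entry.Name)) // is a folder
                continue;

            yield return entry.Open();
        }
    }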
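
For context, here is a minimal sketch of how such a block might sit between the BufferBlock and the processing block from the question. The names ZipPipelineSketch, ExtractEntries and RunAsync are made up for illustration, and the "processing" step just reads each entry as text. One deliberate difference from the snippet above: since ZipArchive instance members are not documented as thread-safe, this sketch copies each entry into its own MemoryStream before yielding it, so the downstream block never reads from the archive while the iterator is still walking it. That is an assumption about what is safe, not something the original answer does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ZipPipelineSketch
{
    // Iterator method: yields one independent stream per file entry, lazily.
    public static IEnumerable<Stream> ExtractEntries(Stream input)
    {
        // leaveOpen: true so the caller keeps ownership of the input stream.
        using (var archive = new ZipArchive(input, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                if (string.IsNullOrWhiteSpace(entry.Name)) // folder entry
                    continue;

                // Copy the entry so the consumer does not touch the archive.
                var copy = new MemoryStream();
                using (Stream entryStream = entry.Open())
                {
                    entryStream.CopyTo(copy);
                }
                copy.Position = 0;
                yield return copy;
            }
        }
    }

    // Hypothetical driver: source -> unzip -> process, returns each entry's text.
    public static async Task<List<string>> RunAsync(Stream zipStream)
    {
        var results = new List<string>();

        var source = new BufferBlock<Stream>();
        var unzip = new TransformManyBlock<Stream, Stream>(input => ExtractEntries(input));
        // Default MaxDegreeOfParallelism is 1, so the list is filled by one item at a time.
        var process = new ActionBlock<Stream>(entryStream =>
        {
            using (var reader = new StreamReader(entryStream))
            {
                results.Add(reader.ReadToEnd());
            }
        });

        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
        source.LinkTo(unzip, linkOptions);
        unzip.LinkTo(process, linkOptions);

        source.Post(zipStream);
        source.Complete();
        await process.Completion;
        return results;
    }
}
```

Copying each entry costs memory proportional to the largest entry, but it keeps the entries independent of the archive, which also allows the archive to be disposed once the iterator finishes.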