c# · .net · .net-core · azure-blob-storage · c#-ziparchive

Download large zip file from azure blob and unzip


I currently have the code below, which downloads a zip file from blob storage using a SAS URI, unzips it, and uploads the contents to a new container:

        var response = await new BlobClient(new Uri(sasUri)).DownloadAsync();
        using (ZipArchive archive = new ZipArchive(response.Value.Content))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient(entry.FullName);
                using (var fileStream = entry.Open())
                {
                    await blobClient.UploadAsync(fileStream, true);
                }
            }
        }

The code fails for me with a "stream too long" exception:

    System.IO.IOException: Stream was too long.
       at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
       at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
       at System.IO.Compression.ZipArchive.Init(Stream stream, ZipArchiveMode mode, Boolean leaveOpen)

My zip file is 9 GB. What would be a better way to get around this exception? I'd like to avoid writing any files to disk.


Solution

  • The solution below worked for me: instead of DownloadAsync, use OpenReadAsync. DownloadAsync returns a non-seekable network stream, and ZipArchive in Read mode needs a seekable stream to reach the zip's central directory, so it first copies the whole download into a MemoryStream — which is capped at 2 GB and throws for a 9 GB file. OpenReadAsync instead returns a seekable stream that fetches ranges of the blob on demand, so the full archive is never held in memory.

    var response = await new BlobClient(new Uri(sasUri)).OpenReadAsync(new BlobOpenReadOptions(false), cancellationToken);
    using (ZipArchive archive = new ZipArchive(response))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient($"{buildVersion}/{entry.FullName}");
            using (var fileStream = entry.Open())
            {
                await blobClient.UploadAsync(fileStream, true, cancellationToken).ConfigureAwait(false);
            }
        }
    }
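
The same streaming approach can be tuned for large archives by sizing the read buffer and the upload blocks. The following is a sketch, not a drop-in replacement: it assumes Azure.Storage.Blobs v12, reuses `sasUri`, `containerName`, `buildVersion`, `_blobServiceClient`, and `cancellationToken` from the answer above, and the buffer/concurrency numbers are illustrative, not recommendations:

    using System.IO.Compression;
    using Azure.Storage;
    using Azure.Storage.Blobs;
    using Azure.Storage.Blobs.Models;

    // Open the blob as a seekable stream that fetches ranges on demand;
    // BufferSize controls how much is pulled per range request (4 MB here, an assumption).
    var readOptions = new BlobOpenReadOptions(allowModifications: false) { BufferSize = 4 * 1024 * 1024 };
    await using var zipStream = await new BlobClient(new Uri(sasUri)).OpenReadAsync(readOptions, cancellationToken);

    using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read);
    var container = _blobServiceClient.GetBlobContainerClient(containerName);

    var uploadOptions = new BlobUploadOptions
    {
        TransferOptions = new StorageTransferOptions
        {
            // Stage each entry in 8 MB blocks, up to 4 in flight (illustrative values).
            MaximumTransferSize = 8 * 1024 * 1024,
            MaximumConcurrency = 4,
        },
    };

    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        BlobClient blobClient = container.GetBlobClient($"{buildVersion}/{entry.FullName}");
        await using var entryStream = entry.Open();
        // UploadAsync accepts the non-seekable entry stream and stages it in blocks.
        await blobClient.UploadAsync(entryStream, uploadOptions, cancellationToken).ConfigureAwait(false);
    }

Note that `entry.Open()` returns a non-seekable stream of unknown length, which `UploadAsync` handles by buffering one block at a time, so peak memory stays around `MaximumTransferSize × MaximumConcurrency` rather than the size of the entry.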