Search code examples
c#csvgzipcsvhelpergzipstream

Why do I get a corrupted *.gz file after creating it with C#?


I’m developing a .NET 5, C# console application to create a CSV file from a list of custom objects, gzip it, and upload it to an Azure Storage container with this code:

var blobServiceClient = new BlobServiceClient("My connection string");
var containerClient = GetBlobContainerClient("My container name");
var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };

var list = new List<FakeModel>
{
   new FakeModel { Field1 = "A", Field2 = "B" },
   new FakeModel { Field1 = "C", Field2 = "D" }
};

await using var memoryStream = new MemoryStream();
await using var streamWriter = new StreamWriter(memoryStream);
await using var csvWriter = new CsvWriter(streamWriter, config);

await csvWriter.WriteRecordsAsync(list);

await using var zip = new GZipStream(memoryStream, CompressionMode.Compress, true);
await memoryStream.CopyToAsync(zip);

memoryStream.Seek(0, SeekOrigin.Begin);
var blockBlob = containerClient.GetBlockBlobClient("test.csv.gz");
await blockBlob.UploadAsync(memoryStream);

It appears to work, but when I download the gzip from the cloud to check it, I get the following error when trying to decompress it:

Error message from 7-Zip saying the archive is invalid

Inspection of the file shows it has a length of 0.

Can you help me understand why?


Solution

  • The stream parameter of a new GZipStream is the destination stream. To process the input to the output, you need to write to the instance of GZipStream somehow.

    When I experimented with it, I found that a call to csvWriter.FlushAsync() was necessary.

    Like this:

    class Program
    {
        public class FakeModel
        {
            public string Field1 { get; set; }
            public string Field2 { get; set; }
        }
    
        static async Task Main(string[] args)
        {
            var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };
    
            var list = new List<FakeModel>
                            {
                               new FakeModel { Field1 = "A", Field2 = "B" },
                               new FakeModel { Field1 = "C", Field2 = "D" }
                            };
    
            await using var memoryStream = new MemoryStream();
            await using var streamWriter = new StreamWriter(memoryStream);
            await using var csvWriter = new CsvWriter(streamWriter, config);
    
            await csvWriter.WriteRecordsAsync(list);
            await csvWriter.FlushAsync();
    
            memoryStream.Position = 0;
    
            await using var fs = new FileStream(@"C:\temp\SO67935249.csv.gz", FileMode.Create);
            await using var zip = new GZipStream(fs, CompressionMode.Compress, true);
            await memoryStream.CopyToAsync(zip);
    
        }
    }