Tags: c#, .net, sql-server, filestream, tar

How can I create a tar file in memory from several varbinary(max) columns stored in SQL Server using .NET native API?


I'm working on an HTTP GET endpoint that lets the client download a tar file built from a number of files stored in SQL Server (FILESTREAM) varbinary(max) columns. In other words, I need to construct a tar file and write the varbinary data into it.

Can this be achieved with the native .NET APIs, such as TarWriter/TarEntry? I'm currently on .NET 8. Or do I need a third-party library such as SharpZipLib?

Pseudo code:

    // Create a MemoryStream to store the .tar file in memory
    using var tarStream = new MemoryStream();
    await using var tarWriter = new TarWriter(tarStream);
        
    // Open a connection to SQL Server
    await using var connection = new SqlConnection("YourConnectionString");
    await connection.OpenAsync();
    var command = new SqlCommand("SELECT FileName, FileData FROM FilesTable", connection);

    await using var reader = command.ExecuteReader();
                
    while (reader.Read())
    {                            
       var fileName = reader.GetString(0);
       var fileDataStream = reader.GetStream(1);
       
       // How to construct TarEntry from a stream?
       tarWriter.WriteEntry(fileName, fileDataStream);
    }

    // Then write the tarStream to response stream.
   

Solution

  • Your pseudo-code is nearly right.

    • Use the async versions and await them: await command.ExecuteReaderAsync(...) and await reader.ReadAsync().
    • Set CommandBehavior.SequentialAccess so that large binaries are streamed rather than buffered, which improves performance and memory usage.
    • Specify leaveOpen: true in the TarWriter constructor, otherwise disposing the TarWriter also disposes the MemoryStream.
    • To write an entry, first create one, set its DataStream, then call tarWriter.WriteEntryAsync.
    • Dispose the stream returned by reader.GetStream once its entry has been written.
    • Consider passing a CancellationToken to each async call.
    • Finally, reset the position of the MemoryStream to 0 and return it in a FileStreamResult.

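    // Namespaces used: System.Data, System.Formats.Tar, Microsoft.AspNetCore.Mvc,
    // and Microsoft.Data.SqlClient (or System.Data.SqlClient, depending on your provider).
    var tarStream = new MemoryStream();    // no need to dispose here; FileStreamResult disposes it after sending

    // leaveOpen: true keeps tarStream usable after the writer is disposed.
    await using (var tarWriter = new TarWriter(tarStream, leaveOpen: true))
    {
        await using var connection = new SqlConnection("YourConnectionString");
        const string query = @"
    SELECT
      FileName,
      FileData
    FROM FilesTable;
    ";
        await using var command = new SqlCommand(query, connection);
        await connection.OpenAsync();

        // SequentialAccess streams each column instead of buffering the whole row,
        // so columns must be read in order (FileName before FileData).
        await using var reader = await command.ExecuteReaderAsync(CommandBehavior.SequentialAccess);

        while (await reader.ReadAsync())
        {
            var fileName = reader.GetString(0);
            await using var fileDataStream = reader.GetStream(1);

            var entry = new PaxTarEntry(TarEntryType.RegularFile, fileName)
            {
                DataStream = fileDataStream,
            };
            await tarWriter.WriteEntryAsync(entry);
        }
    }   // disposing the writer here finalizes the archive

    // Rewind so the response is read from the start of the archive.
    tarStream.Position = 0;
    return new FileStreamResult(tarStream, "application/x-tar");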
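
To check the tar-writing part without a database, the same TarWriter pattern can be exercised against in-memory streams and read back with TarReader. The following is only a sketch under a few assumptions: the TarSketch class and the BuildTarAsync and ReadTarAsync helpers are hypothetical names introduced here for illustration, the file contents are assumed to be UTF-8 text so they can be compared as strings, and the optional CancellationToken shows where a token would be passed through.

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, not part of the original answer.
public static class TarSketch
{
    // Builds a tar archive in memory from (name, data stream) pairs.
    public static async Task<MemoryStream> BuildTarAsync(
        IEnumerable<(string Name, Stream Data)> files,
        CancellationToken cancellationToken = default)
    {
        var tarStream = new MemoryStream();

        // leaveOpen: true keeps tarStream usable after the writer is disposed.
        await using (var tarWriter = new TarWriter(tarStream, leaveOpen: true))
        {
            foreach (var (name, data) in files)
            {
                var entry = new PaxTarEntry(TarEntryType.RegularFile, name)
                {
                    DataStream = data,
                };
                await tarWriter.WriteEntryAsync(entry, cancellationToken);
            }
        }

        tarStream.Position = 0; // rewind for the next reader
        return tarStream;
    }

    // Reads the archive back and returns entry name -> content decoded as UTF-8.
    public static async Task<Dictionary<string, string>> ReadTarAsync(
        Stream tarStream,
        CancellationToken cancellationToken = default)
    {
        var result = new Dictionary<string, string>();
        await using var tarReader = new TarReader(tarStream, leaveOpen: true);

        TarEntry? entry;
        while ((entry = await tarReader.GetNextEntryAsync(copyData: false, cancellationToken)) is not null)
        {
            if (entry.EntryType is not TarEntryType.RegularFile)
            {
                continue;
            }

            if (entry.DataStream is null)
            {
                result[entry.Name] = string.Empty; // empty files carry no data stream
                continue;
            }

            // Read the data before advancing to the next entry.
            using var text = new StreamReader(entry.DataStream, Encoding.UTF8, leaveOpen: true);
            result[entry.Name] = await text.ReadToEndAsync(cancellationToken);
        }

        return result;
    }
}
```

Passing two small in-memory files through BuildTarAsync and then ReadTarAsync should give back the same names and contents, which is a quick way to confirm the archive is well-formed before wiring in the SqlDataReader streams.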