c# .net asp.net-core sftp winscp

List files inside ZIP file located on SFTP server in C#


I need to programmatically process folders inside a ZIP file located on an SFTP server (WinSCP) from ASP.NET Core.

Is there a way to get the list of files inside the ZIP file without downloading it to the local computer? The file can be large, and its size is not consistent. Any help would be appreciated.


Solution

  • With the SSH.NET library, it could be as easy as:

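    using System;
    using System.IO;
    using System.IO.Compression;
    using Renci.SshNet;

    using (var client = new SftpClient(host, username, password))
    {
        client.Connect();

        // Open the remote file as a seekable stream; nothing is downloaded yet
        using (Stream stream = client.OpenRead("/remote/path/archive.zip"))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            foreach (var entry in archive.Entries)
            {
                // FullName is the entry's path inside the archive
                Console.WriteLine(entry.FullName);
            }
        }
    }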
    

    You need to reference the System.IO.Compression assembly to get ZipArchive.

    The code will only read (download) the ZIP central directory record, not the whole ZIP archive. For proof, see the end of the answer.


    Unfortunately, there's a bug in the library's handling of SeekOrigin.End. To work around it, you have to implement a wrapper Stream like this:

    class FixStream : Stream
    {
        public override long Seek(long offset, SeekOrigin origin)
        {
            long result;
            // workaround for SSH.NET bug in implementation of SeekOrigin.End
            if (origin == SeekOrigin.End)
            {
                result = _stream.Seek(Length + offset, SeekOrigin.Begin);
            }
            else
            {
                result = _stream.Seek(offset, origin);
            }
            return result;
        }
    
        // passthrough implementation of the rest of Stream interface
    
        public override bool CanRead => _stream.CanRead;
    
        public override bool CanSeek => _stream.CanSeek;
    
        public override bool CanWrite => _stream.CanWrite;
    
        public override long Length => _stream.Length;
    
        public override long Position { 
            get => _stream.Position; set => _stream.Position = value; }
    
        public FixStream(Stream stream)
        {
            _stream = stream;
        }
    
        public override void Flush()
        {
            _stream.Flush();
        }
    
        public override int Read(byte[] buffer, int offset, int count)
        {
            return _stream.Read(buffer, offset, count);
        }
    
        public override void SetLength(long value)
        {
            _stream.SetLength(value);
        }
    
        public override void Write(byte[] buffer, int offset, int count)
        {
            _stream.Write(buffer, offset, count);
        }
    
        private Stream _stream;
    }
    

    And wrap the SftpFileStream in it:

    using (Stream stream = client.OpenRead("/remote/path/archive.zip"))
    using (var stream2 = new FixStream(stream))
    using (var archive = new ZipArchive(stream2, ZipArchiveMode.Read))
    {
        ...
    }
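
    Since the question is about processing folders, here is a minimal sketch of how the entries could be grouped by the folder they live in. It is not part of the original answer. The names ZipFolderListing, GroupByFolder, FolderOf and BuildSampleZip are hypothetical, and an in-memory ZIP stands in for the stream you would get from client.OpenRead (wrapped in FixStream), so the example runs without an SFTP server. It assumes entry names use forward slashes as separators.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class ZipFolderListing
{
    // Hypothetical helper: maps each folder path to the file entries directly inside it.
    public static Dictionary<string, List<ZipArchiveEntry>> GroupByFolder(ZipArchive archive)
    {
        return archive.Entries
            // Skip explicit directory entries (names ending with a slash)
            .Where(e => !e.FullName.EndsWith("/", StringComparison.Ordinal))
            .GroupBy(e => FolderOf(e.FullName))
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    // Returns everything before the last slash, or "" for entries at the archive root.
    static string FolderOf(string fullName)
    {
        int slash = fullName.LastIndexOf('/');
        return slash < 0 ? "" : fullName.Substring(0, slash);
    }

    // Builds a small ZIP in memory; a stand-in for the remote archive.
    public static MemoryStream BuildSampleZip()
    {
        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var name in new[] { "docs/readme.txt", "docs/img/logo.txt", "src/main.txt" })
            {
                using (var writer = new StreamWriter(zip.CreateEntry(name).Open()))
                {
                    writer.Write("sample content for " + name);
                }
            }
        }
        buffer.Position = 0;
        return buffer;
    }

    public static void Main()
    {
        using (Stream stream = BuildSampleZip())
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            foreach (var folder in GroupByFolder(archive).OrderBy(f => f.Key, StringComparer.Ordinal))
            {
                Console.WriteLine(folder.Key == "" ? "(root)" : folder.Key + "/");
                foreach (var entry in folder.Value)
                {
                    // Name and Length come from the central directory, not from the entry data
                    Console.WriteLine($"    {entry.Name} ({entry.Length} bytes)");
                }
            }
        }
    }
}
```

    With the remote archive, you would pass the ZipArchive created over the FixStream to GroupByFolder instead of the in-memory one. Only calling Open on an entry should cause that entry's data to be read.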
    

    As proof that it really works, I've added logging to all methods of FixStream. When the code was run against an 18 MB (18265315 bytes) ZIP archive with two entries, the following was produced. So only 244 bytes were read from the stream. More than that is actually read from the remote SFTP file, because SSH.NET buffers the reads (otherwise the code would be quite inefficient, particularly in this case, as you can see that ZipArchive does a lot of small reads). The default SSH.NET buffer is 32 KB (SftpClient.BufferSize).

    Tried to seek to -18 from End => converting to seek to 18265297 from Begin
    Seeked to 18265297 from Begin => 18265297
    Seeked to -32 from Current => 18265265
    Tried to read 32, got 32
    Seeked to -32 from Current => 18265265
    Seeked to 28 from Current => 18265293
    Tried to read 4, got 4
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 2, got 2
    Seeked to 18265075 from Begin => 18265075
    Tried to read 4, got 4
    Tried to read 1, got 1
    Tried to read 1, got 1
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 28, got 28
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 32, got 32
    Set position to 18265185
    Tried to read 4, got 4
    Tried to read 1, got 1
    Tried to read 1, got 1
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 4, got 4
    Tried to read 4, got 4
    Tried to read 26, got 26
    Tried to read 2, got 2
    Tried to read 2, got 2
    Tried to read 32, got 32
    Set position to 18265293
    Tried to read 4, got 4
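
    A similar measurement can be tried locally without an SFTP server. The following is a rough sketch, not the logging code used for the output above: CountingStream, ReadCountDemo and Measure are hypothetical names, and an in-memory archive with two incompressible 1 MB entries stands in for the remote file. The exact number of bytes read depends on the .NET version, so no specific figure is claimed here.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Pass-through wrapper that counts the bytes actually read through it.
public class CountingStream : Stream
{
    private readonly Stream _inner;

    public long BytesRead { get; private set; }

    public CountingStream(Stream inner) { _inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int got = _inner.Read(buffer, offset, count);
        BytesRead += got;
        return got;
    }

    public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => false;
    public override long Length => _inner.Length;
    public override long Position { get => _inner.Position; set => _inner.Position = value; }
    public override void Flush() => _inner.Flush();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class ReadCountDemo
{
    // Builds a ZIP with two stored (uncompressed) 1 MB entries and lists it
    // through a CountingStream. Returns the archive size and the bytes read.
    public static (long ArchiveLength, long BytesRead) Measure()
    {
        var buffer = new MemoryStream();
        var payload = new byte[1024 * 1024];
        new Random(1).NextBytes(payload);

        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var name in new[] { "first.bin", "second.bin" })
            {
                var entry = zip.CreateEntry(name, CompressionLevel.NoCompression);
                using (var target = entry.Open())
                {
                    target.Write(payload, 0, payload.Length);
                }
            }
        }
        buffer.Position = 0;

        var counting = new CountingStream(buffer);
        using (var archive = new ZipArchive(counting, ZipArchiveMode.Read))
        {
            foreach (var entry in archive.Entries)
            {
                Console.WriteLine(entry.FullName);
            }
            return (counting.Length, counting.BytesRead);
        }
    }

    public static void Main()
    {
        var (length, read) = Measure();
        Console.WriteLine($"Archive size: {length} bytes, read while listing: {read} bytes");
    }
}
```

    If listing touches only the end of the archive, the reported read count should be a small fraction of the archive size. With a real SftpFileStream the network traffic will be higher than this count because of the SSH.NET read buffer mentioned above.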