powershell, zip, system.io.compression, c#-ziparchive

Read zip file with PowerShell


Is there a faster way to filter a file inside a zip archive? My code reads the file line by line, and the data loads very slowly. Can I filter more than one line at a time?

$ZipPath = 'C:\Test\TestZip.zip'
Add-Type -assembly "system.io.compression.filesystem"
$zip = [io.compression.zipfile]::OpenRead($ZipPath)
$file = $zip.Entries[0]
$stream = $file.Open()
$reader = New-Object IO.StreamReader($stream)
$eachlinenumber = 1
while (($readeachline = $reader.ReadLine()) -ne $null)
{
    
    $x = select-string -pattern "Order1" -InputObject $readeachline 
    Add-Content C:\text\TestFile.txt $x
}  

$reader.Close()
$stream.Close()
$zip.Dispose()

Solution

  • Reading the content line by line is not what makes your code slow. The actual cost is appending to a file on every loop iteration: each Add-Content call opens the file, writes, and closes it again. I assume you want all lines matching Order1 from your zip entry written to TestFile.txt, in which case you should consider using a StreamWriter in combination with the StreamReader. This keeps the file stream open while you iterate over the lines.

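    try {
        Add-Type -AssemblyName System.IO.Compression.Filesystem
    
        $zipPath = 'C:\Test\TestZip.zip'
        $zipFile = [IO.Compression.ZipFile]::OpenRead($ZipPath)
        # Only the first entry of the archive is read
        $zipEntry = $zipFile.Entries[0]
        $entryStream = $zipEntry.Open()
        $reader = [IO.StreamReader]::new($entryStream)
        # This overload creates or overwrites the file;
        # [IO.StreamWriter]::new($path, $true) appends instead, like Add-Content
        $writer = [IO.StreamWriter]::new('C:\text\TestFile.txt')
    
        # The output file stays open for the whole loop
        while (-not $reader.EndOfStream) {
            if(($line = $reader.ReadLine()) -match 'Order1') {
                $writer.WriteLine($line)
            }
        }
    }
    finally {
        # Release the file handles even if something above throws
        $reader, $writer, $entryStream, $zipFile | ForEach-Object Dispose
    }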
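
    If you would rather keep the cmdlets, the same idea (no file write inside the loop) can be sketched by collecting the matches in memory and writing them once at the end. This is only a minimal sketch: the temporary folder, the orders.txt sample data, and the $matched and $outPath names are made up so the example runs on its own, and it assumes the matching lines fit in memory. For very large results the StreamWriter version above is likely the safer choice.

    ```powershell
    Add-Type -AssemblyName System.IO.Compression
    Add-Type -AssemblyName System.IO.Compression.FileSystem

    # Build a small throwaway zip so the sketch is self-contained (hypothetical demo data)
    $workDir   = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid())
    $sourceDir = Join-Path $workDir 'src'
    $null = New-Item -ItemType Directory -Path $sourceDir -Force
    Set-Content -Path (Join-Path $sourceDir 'orders.txt') -Value 'Order1 apples', 'Order2 pears', 'Order1 plums'
    $zipPath = Join-Path $workDir 'TestZip.zip'
    [IO.Compression.ZipFile]::CreateFromDirectory($sourceDir, $zipPath)
    $outPath = Join-Path $workDir 'TestFile.txt'

    $reader  = $null
    $zipFile = [IO.Compression.ZipFile]::OpenRead($zipPath)
    try {
        $reader = [IO.StreamReader]::new($zipFile.Entries[0].Open())
        # Gather the matching lines in memory; nothing is written inside the loop
        $matched = [Collections.Generic.List[string]]::new()
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match 'Order1') { $matched.Add($line) }
        }
    }
    finally {
        if ($reader) { $reader.Dispose() }
        $zipFile.Dispose()
    }

    # One write after the loop instead of one Add-Content call per line
    $matched | Set-Content -Path $outPath
    ```

    With the sample data, $outPath should end up holding the two Order1 lines.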
    

    If you're looking to simplify the process demonstrated above (opening a zip archive and reading the content of its entries), you might find it easier with the PSCompression Module (Disclaimer: I'm the author of this module).

    This is how the code would look using the module:

    Get-ZipEntry 'C:\Test\TestZip.zip' -EntryType Archive |
        Select-Object -First 1 |
        Get-ZipEntryContent |
        Where-Object { $_ -match 'Order1' } |
        Set-Content 'C:\text\TestFile.txt'
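
    If installing a module is not an option, a similar pipeline shape can be approximated with a small helper function. The sketch below is not how PSCompression is implemented; Read-ZipEntryLine and its -EntryIndex parameter are hypothetical names introduced only for illustration. It emits the lines of one entry to the pipeline, so Where-Object and Set-Content can do the filtering and a single write. Expect the pipeline to add some per-line overhead compared with the plain while loop.

    ```powershell
    Add-Type -AssemblyName System.IO.Compression.FileSystem

    # Hypothetical helper, not part of PSCompression:
    # emits the lines of one zip entry to the pipeline
    function Read-ZipEntryLine {
        param(
            # Pass a full path; .NET resolves relative paths against the
            # process working directory, not the PowerShell location
            [Parameter(Mandatory)] [string] $Path,
            [int] $EntryIndex = 0
        )

        $reader  = $null
        $zipFile = [IO.Compression.ZipFile]::OpenRead($Path)
        try {
            $reader = [IO.StreamReader]::new($zipFile.Entries[$EntryIndex].Open())
            while ($null -ne ($line = $reader.ReadLine())) {
                $line   # sent down the pipeline one line at a time
            }
        }
        finally {
            if ($reader) { $reader.Dispose() }
            $zipFile.Dispose()
        }
    }

    # Usage with the paths from the question:
    # Read-ZipEntryLine -Path 'C:\Test\TestZip.zip' |
    #     Where-Object { $_ -match 'Order1' } |
    #     Set-Content 'C:\text\TestFile.txt'
    ```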