Is there any faster way to filter a file inside a zip archive? My code reads the file line by line, so the data loads very slowly. Can I filter more than one line at a time?
$ZipPath = 'C:\Test\TestZip.zip'
Add-Type -AssemblyName System.IO.Compression.FileSystem
$zip = [IO.Compression.ZipFile]::OpenRead($ZipPath)
$file = $zip.Entries[0]
$stream = $file.Open()
$reader = New-Object IO.StreamReader($stream)
while (($readeachline = $reader.ReadLine()) -ne $null) {
    $x = Select-String -Pattern "Order1" -InputObject $readeachline
    Add-Content C:\text\TestFile.txt $x
}
$reader.Close()
$stream.Close()
$zip.Dispose()
The issue with your code is not that you're reading the content line by line; the actual bottleneck is appending to the output file on each loop iteration, which opens and closes the file once per matching line. I assume you're looking to have all lines matching Order1 from your zip entry added to TestFile.txt, in which case you should consider using a StreamWriter in combination with the StreamReader. This keeps the output file stream open while iterating over each line.
try {
    Add-Type -AssemblyName System.IO.Compression.FileSystem

    $zipPath     = 'C:\Test\TestZip.zip'
    $zipFile     = [IO.Compression.ZipFile]::OpenRead($zipPath)
    $zipEntry    = $zipFile.Entries[0]
    $entryStream = $zipEntry.Open()
    $reader      = [IO.StreamReader]::new($entryStream)
    $writer      = [IO.StreamWriter]::new('C:\text\TestFile.txt')

    while (-not $reader.EndOfStream) {
        if (($line = $reader.ReadLine()) -match 'Order1') {
            $writer.WriteLine($line)
        }
    }
}
finally {
    # Dispose every handle in one pass, even if an exception was thrown above
    $reader, $writer, $entryStream, $zipFile | ForEach-Object Dispose
}
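To the question of filtering more than one line at a time: you can also read the whole entry in a single call and let the -match operator filter the resulting array of lines in one pass, trading memory for fewer I/O calls. A minimal sketch of that approach, assuming the entry fits comfortably in memory (paths reused from the example above):

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

$zipFile = [IO.Compression.ZipFile]::OpenRead('C:\Test\TestZip.zip')
try {
    $reader = [IO.StreamReader]::new($zipFile.Entries[0].Open())

    # ReadToEnd pulls the entire entry content in one call;
    # -split breaks it into individual lines
    $lines = $reader.ReadToEnd() -split '\r?\n'
    $reader.Dispose()

    # When the left operand is an array, -match returns
    # every element that matches the pattern
    Set-Content 'C:\text\TestFile.txt' -Value ($lines -match 'Order1')
}
finally {
    $zipFile.Dispose()
}
```

For very large entries the streaming StreamReader / StreamWriter version above is still the safer choice, since it never holds more than one line in memory.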
If you're looking to simplify the process demonstrated above, reading zip archive entries and filtering their content, you might find it easier with the PSCompression Module (Disclaimer: I'm the author of this module).
This is how the code would look using the module:
Get-ZipEntry 'C:\Test\TestZip.zip' -EntryType Archive |
Select-Object -First 1 |
Get-ZipEntryContent |
Where-Object { $_ -match 'Order1' } |
Set-Content 'C:\text\TestFile.txt'