I made a program in C# where it processes about 30 zipped folders which have about 35000 files in total. My purpose is to read every single file for processing its information. As of now, my code extracts all the folders and then read the files. The problem with this process is it takes about 15-20 minutes for it to happen, which is a lot.
I am using the following code to extract files:
void ExtractFile(string zipfile, string path)
{
ZipFile zip = ZipFile.Read(zipfile);
zip.ExtractAll(path);
}
The extraction part is the one which takes the most time to process. I need to reduce this time. Is there a way I can read the contents of the files inside the zipped folder without extracting them? or if anyone knows any other way that can help me reduce the time of this code ?
Thanks in advance
You could try reading each entry into a memory stream instead of to the file system:
ZipFile zip = ZipFile.Read(zipfile);
foreach(ZipEntry entry in zip.Entries)
{
using(MemoryStream ms = new MemoryStream())
{
entry.Extract(ms);
ms.Seek(0,SeekOrigin.Begin);
// read from the stream
}
}