c# .net-core windows-services filesystemwatcher

How to watch around 100 folders in parallel? FileSystemWatcher or other more efficient options?


  • I have a base folder named Data on a drive, and under it there are around 100 folders.

  • A third-party application pushes a zip file (containing one or more files) into each of the folders Folder1 through Folder100.

  • I have to write a Windows service that watches all 100 folders for file arrival.

  • Once a file is available, I need to extract the zip file and place all the extracted files into a second folder. I need to do this for each folder (Folder1 through Folder100) as soon as files are available.

  • The code below suggests that with the C# FileSystemWatcher I can watch one folder at a time and act on it.

The question is: how do I watch 100 folders in parallel?

class ExampleAttributesChangedFiringTwice
{
    public ExampleAttributesChangedFiringTwice(string demoFolderPath)
    {
        var watcher = new FileSystemWatcher()
        {
            Path = demoFolderPath,
            NotifyFilter = NotifyFilters.LastWrite,
            Filter = "*.txt"
        };

        watcher.Changed += OnChanged;
        watcher.EnableRaisingEvents = true;
    }

    private static void OnChanged(object source, FileSystemEventArgs e)
    {
        // extract zip file, do the validation, copy file into other destination
    }
}

Is the target folder the same regardless of the zip's source folder? That is, whether the zip comes from Folder1 or Folder2, will both be extracted to FolderX?

The target folder is common for all of them: "C:\ExtractedData".

So every folder under Data will be watched? No "blacklisted" folders? What if a zip appears in Data itself instead of in a subfolder? What if a new subfolder is created, should it be watched too?

The zip always arrives inside a subfolder; it will never be created directly in the Data folder. Yes, more subfolders may be added in the future, and they will need to be watched.

And do the extracted files go into a separate subfolder inside the target folder based on the zip filename, or are they extracted directly into the target folder? E.g., if it's A.zip, does the content go to Target\A or just Target?

For example, if A.zip contains two files, "1.txt" and "2.txt", then both files go to "C:\ExtractedData". This is the same for every zip file that arrives in any subfolder.


Solution

  • The "100 folders in parallel" part turns out to be a red herring. Since all new zip files are treated the same regardless of where they show up, just adding IncludeSubdirectories = true is enough. Note that the following code is prone to exceptions; read the comments.

    using System.IO;
    using System.IO.Compression;

    class WatchAndExtract
    {
        private readonly string inputPath, targetPath;
        private readonly FileSystemWatcher watcher; // kept in a field so it lives as long as this object

        public WatchAndExtract(string inputPath, string targetPath)
        {
            this.inputPath = inputPath;
            this.targetPath = targetPath;
            watcher = new FileSystemWatcher()
            {
                Path = inputPath,
                NotifyFilter = NotifyFilters.FileName,
                // Add other filters if your third-party app doesn't copy a new file in one step,
                // but instead creates it and then writes to it.
                Filter = "*.zip",
                IncludeSubdirectories = true
            };
            watcher.Created += OnCreated; // use Changed if the file isn't copied in one step
            watcher.EnableRaisingEvents = true;
        }

        private void OnCreated(object source, FileSystemEventArgs e)
        {
            // Add filters if you're using Changed instead; see
            // https://stackoverflow.com/questions/1764809/filesystemwatcher-changed-event-is-raised-twice
            // This throws an exception if the zip file is still being written.
            // Catch it and retry after a delay, or wait until the last LastWrite event is a few seconds old.
            using (var archive = ZipFile.OpenRead(e.FullPath))
            {
                archive.ExtractToDirectory(targetPath);
            }
        }
    }
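
The comments in OnCreated suggest catching the exception and retrying after a delay. A minimal sketch of that idea might look like the following. `ZipProbe`, `WaitUntilReadable`, and the attempt and delay defaults are hypothetical names and values, not part of the original answer, and the sketch assumes the third-party application keeps the file open (or leaves the archive incomplete) until it has finished writing.

```csharp
using System.IO;
using System.IO.Compression;
using System.Threading;

static class ZipProbe
{
    // Hypothetical helper: returns true once the zip can be opened exclusively and
    // its directory can be read; returns false if that never happens within maxAttempts.
    public static bool WaitUntilReadable(string zipPath, int maxAttempts = 10, int delayMs = 500)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                // FileShare.None fails with IOException while another process has the file open.
                using (var stream = new FileStream(zipPath, FileMode.Open, FileAccess.Read, FileShare.None))
                using (new ZipArchive(stream, ZipArchiveMode.Read))
                {
                    return true;
                }
            }
            catch (IOException) { }          // file is locked, or not there (yet)
            catch (InvalidDataException) { } // archive is truncated or still incomplete

            if (attempt < maxAttempts) Thread.Sleep(delayMs);
        }
        return false;
    }
}
```

Inside OnCreated this could be used as `if (ZipProbe.WaitUntilReadable(e.FullPath)) ZipFile.ExtractToDirectory(e.FullPath, targetPath);`. Sleeping inside the event handler delays the delivery of later events, so for busy folders the wait is probably better done on a separate thread.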
    

    If the watcher skips some files, either too many files are being created at once or the zip files are too big to process in time (or both). Either increase the watcher's buffer size (InternalBufferSize) or process the files on a separate thread. On an HDD with busy I/O, or with extremely large zip files, events may arrive faster than the disk can keep up with, and files can be skipped after a prolonged busy period; in that case, consider writing to a different physical drive (not just a different partition on the same device). Always verify with your predicted usage pattern.
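
The two suggestions above (a bigger buffer and doing the work on another thread) can be combined roughly as follows. This is a sketch under assumptions, not the original answer's code: `QueuedWatchAndExtract` and `Enqueue` are hypothetical names, a single background worker is assumed to be enough, and the 64 KB value is the upper limit for InternalBufferSize as far as I know from the documentation. The event handler only records the path, so it returns quickly and the watcher's buffer is less likely to overflow.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

class QueuedWatchAndExtract : IDisposable
{
    private readonly FileSystemWatcher watcher;
    private readonly BlockingCollection<string> pending = new BlockingCollection<string>();
    private readonly Task worker;

    public QueuedWatchAndExtract(string inputPath, string targetPath)
    {
        watcher = new FileSystemWatcher(inputPath, "*.zip")
        {
            NotifyFilter = NotifyFilters.FileName,
            IncludeSubdirectories = true,
            InternalBufferSize = 64 * 1024 // default is 8 KB
        };
        watcher.Created += (s, e) => Enqueue(e.FullPath);
        // Error is raised, among other cases, when the internal buffer overflows.
        watcher.Error += (s, e) => Console.Error.WriteLine(e.GetException());

        // Single background worker: extracts queued zips one at a time.
        worker = Task.Run(() =>
        {
            foreach (var zipPath in pending.GetConsumingEnumerable())
            {
                try { ZipFile.ExtractToDirectory(zipPath, targetPath); }
                catch (Exception ex) { Console.Error.WriteLine($"{zipPath}: {ex.Message}"); }
            }
        });
        watcher.EnableRaisingEvents = true;
    }

    // Hypothetical entry point: queues a zip path for extraction on the worker.
    public void Enqueue(string zipPath)
    {
        try { pending.Add(zipPath); }
        catch (InvalidOperationException) { /* shutting down; ignore late events */ }
    }

    public void Dispose()
    {
        watcher.Dispose();        // stop raising events
        pending.CompleteAdding(); // let the worker drain the queue and finish
        worker.Wait();
        pending.Dispose();
    }
}
```

The worker here extracts as soon as the Created event arrives, so it still fails on a zip that is being written; a failed file is only logged and not retried in this sketch.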