Search code examples
multithreadingtask-parallel-libraryfilesystemwatchertpl-dataflow

Process all file change events within specified time frame with TPL Dataflow


I am monitoring multiple log files across multiple directories. I need to trigger an SSIS package when a file has fired an onchange event. Easy enough, but the complication is I don't want to trigger the SSIS package every time there is a change on the file. I want to wait and capture at least 5 minutes worth of changes to a specific file.

Having used FilewSystemWatcher before I know it triggers each onchange event in a new thread - My thought is to pass these events into a TPL block and have it wait for a specified time interval and then trigger an SSIS package. Basically triggering a related SSIS package every 5 minutes if there have been file change events.

If anyone could point me in the right direction as a starting point I would greatly appreciate it!


Solution

  • I can't tell from your question whether TPL Dataflow is a requirement or just an idea.

    I'd probably keep it simple and avoid TPL dataflow. Just poll using a System.Threading.Timer and have it check System.IO.File.GetLastWriteTime on the file.

    If you want to get fancy you could use Rx, convert the FilewSystemWatcher event to an Observable, and use the Buffer(TimeSpan) method.

    TPL Dataflow doesn't have any intrinsic support for time windows you'd have to roll your own, probably using one of the aforementioned two methods to build it out. My experience with TPL Dataflow is that it's too big and cumbersome for small tasks, and too rudimentary for big tasks, so I'd avoid taking that approach.