I'm writing an application that uses WatchService to scan a directory for newly added files and then process them. This portion is working as expected. Now I need to write code to handle files that were added to the directory before the service is started.
The naive approach would be to simply get a list of files in the folder before registering the path with the WatchService. I have concerns, though, that this may leave a gap between processing the pre-existing files and watching for new events, during which I might miss incoming files. The safest option would be to start watching for events, but not process them, until I've dealt with the files already present.
Is there some way to manually change the WatchKey's status to 'signalled'? This would accomplish my goals, but I'm not seeing a way to do this in WatchService's documentation.
Here is a way to address the concerns: start the watch service and the get-files-list-from-folder process simultaneously.
Both processes first put each file (its path) into a thread-safe queue collection that allows last-in-first-out (LIFO) or first-in-first-out (FIFO) processing. Depending on the requirement, consider java.util.concurrent.ConcurrentLinkedDeque, LinkedBlockingDeque, ConcurrentLinkedQueue or LinkedBlockingQueue. This way all files are processed one after the other, irrespective of whether they come from the get-files-list-from-folder process or the watch service.
However, a check is required so that no file is added to the queue twice. This is needed only at application startup, when both sources can report the same file. The file-processing code itself can track the files it has processed in another collection, which can be used to check whether a file has already been processed.
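A minimal sketch of this idea, under some assumptions: the class name DirectoryFeeder and its methods are hypothetical, only ENTRY_CREATE events are of interest, and a LinkedBlockingQueue (FIFO) is the chosen queue. The directory is registered before the existing files are listed, so a file that arrives during the listing should show up through at least one of the two sources, and a concurrent set drops it if it shows up through both. OVERFLOW events are ignored here; a real application may want to rescan the directory when one occurs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: feeds pre-existing and newly created files into one queue.
final class DirectoryFeeder {
    private final BlockingQueue<Path> queue = new LinkedBlockingQueue<>();
    // Paths already enqueued. add() on this set is atomic, so it serves as the duplicate check.
    private final Set<Path> seen = ConcurrentHashMap.newKeySet();

    // Enqueues the path unless it was enqueued before; returns true if it was added.
    boolean offerOnce(Path file) {
        return seen.add(file) && queue.offer(file);
    }

    // Lists the regular files currently in dir and enqueues each; returns how many were new.
    int scanExisting(Path dir) throws IOException {
        int added = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path file : stream) {
                if (Files.isRegularFile(file) && offerOnce(file)) {
                    added++;
                }
            }
        }
        return added;
    }

    // Blocks on the watch service and enqueues the path of every ENTRY_CREATE event.
    void watchLoop(WatchService watcher, Path dir) throws InterruptedException {
        while (true) {
            WatchKey key = watcher.take();
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                    offerOnce(dir.resolve((Path) event.context()));
                }
            }
            if (!key.reset()) {
                break; // the directory is no longer accessible
            }
        }
    }

    BlockingQueue<Path> queue() {
        return queue;
    }

    // Registers the directory first, then starts the watcher thread and the initial scan.
    static DirectoryFeeder start(Path dir) throws IOException {
        DirectoryFeeder feeder = new DirectoryFeeder();
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
        Thread watchThread = new Thread(() -> {
            try {
                feeder.watchLoop(watcher, dir);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "watch-loop");
        watchThread.setDaemon(true);
        watchThread.start();
        feeder.scanExisting(dir);
        return feeder;
    }
}
```

A consumer thread would then call DirectoryFeeder.start(dir) once and loop on queue().take() to process files in arrival order. Note that in this sketch the seen set grows for as long as the application runs; since duplicates are only expected around startup, it could be cleared or bypassed once the initial scan has finished.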