Search code examples
pythonpython-2.7pywin32python-watchdog

Python watchdog duplicate events


I've created a modified watchdog example in order to monitor a file for .jpg photos that have been added to the specific directory in Windows.

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

paths = []

xp_mode = 'off'

class FileHandler(FileSystemEventHandler):

    def on_created(self, event):
        if xp_mode == 'on':
            if not event.is_directory and not 'thumbnail' in event.src_path:
                print "Created: " + event.src_path
                paths.append(event.src_path)

    def on_modified(self, event):
        if not event.is_directory and not 'thumbnail' in event.src_path:
            print "Modified: " + event.src_path
            paths.append(event.src_path)

if __name__ == "__main__":
    path = 'C:\\'
    event_handler = FileHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observe r.stop()

    observer.join()

One of the things that I have noticed that when a file is added, both on_created and on_modified is called! To combat this problem, I decided to only use the on_modified method. However, I am starting to notice that this also causes multiple callbacks, but this time to the on_modified method!

Modified: C:\images\C121211-0008.jpg
Modified: C:\images\C121211-0009.jpg
Modified: C:\images\C121211-0009.jpg <--- What?
Modified: C:\images\C121211-0010.jpg
Modified: C:\images\C121211-0011.jpg
Modified: C:\images\C121211-0012.jpg
Modified: C:\images\C121211-0013.jpg

I cannot figure out for the life of me why this is happening! It doesn't seem to be consistent either. If anyone could shed some light on this issue, it will be greatly appreciated.

There was a similar post, but it was for Linux: python watchdog modified and created duplicate events


Solution

  • When a process writes a file, it first creates it, then writes the contents a piece at a time.

    What you're seeing is a set of events corresponding to those actions. Sometimes the pieces are written quickly enough that Windows only sends a single event for all of them, and other times you get multiple events.

    This is normal... depending on what the surrounding code needs to do, it might make sense to keep a set of modified pathnames rather than a list.