python docker event-handling docker-volume watchdog

How can I watch for new files in a Docker volume using Python?


I have a Python script that watches for new files in a directory on my local machine using the watchdog package and uploads them to a remote server. The code works fine when I run it on my local machine, but now I want to run it inside a Docker container, where the files will be stored in a volume mounted to the container.

The problem is that my current implementation of the watchdog script doesn't work with Docker volumes. I suspect that this is because the FileSystemEventHandler class is not detecting the new files in the volume.

I tried to modify the path argument of the Observer object to the mounted path of the Docker volume, but this didn't work. I expected the script to detect the new files in the volume and upload them to the remote server, just like it does on my local machine. However, when I run the script inside the Docker container, it doesn't seem to detect any new files in the volume.

Here's the relevant code:

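import os
import sys
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

# args, directory and upload_file are defined elsewhere in the script.

class NewFileHandler(FileSystemEventHandler):
    def on_created(self, event):
        # Ignore newly created subdirectories; only files are uploaded.
        if event.is_directory:
            return
        file_path = event.src_path
        file_name = os.path.basename(file_path)
        upload_file(file_path, file_name)

if args.watch_new == 'True':
    print(f"Waiting for new files in directory {directory}...")
    sys.stdout.flush()

    event_handler = NewFileHandler()
    observer = Observer()
    observer.schedule(event_handler, directory, recursive=True)
    observer.start()

    try:
        # Keep the main thread alive while the observer thread runs.
        while True:
            time.sleep(1)
            sys.stdout.flush()
    except KeyboardInterrupt:
        observer.stop()
    observer.join()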

where directory points to a volume that docker-compose.yml mounts from the DIRECTORY variable in .env:

    volumes:
      - "${DIRECTORY}:/data"

Solution

  • I've found that polling works in this case.

    import os
    import time

    def get_file_list(directory):
        # Names of the regular files directly inside directory (not recursive).
        return [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

    def wait_new_files(directory):
        # Take an initial snapshot, then diff against it on every poll.
        previous_files = get_file_list(directory)
        while True:
            current_files = get_file_list(directory)
            new_files = set(current_files) - set(previous_files)
            for new_file in new_files:
                print("New file created:", new_file)
                file_path = os.path.join(directory, new_file)
                # upload_file is the uploader from the original script.
                upload_file(file_path, new_file)
            previous_files = current_files
            time.sleep(1)

    Does anyone know of other ways to monitor a Docker volume for changes, or is the polling method the only viable option in this case?
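
One other option that stays within watchdog, offered as a sketch rather than a confirmed fix for this setup: the library ships a PollingObserver (in watchdog.observers.polling) that compares directory snapshots instead of relying on OS notifications such as inotify, which can go missing on bind mounts backed by a host or network filesystem. The function name watch_with_polling, the on_new_file callback, and the 1-second default interval are illustrative assumptions, not part of the original script.

```python
def watch_with_polling(directory, on_new_file, interval=1.0):
    """Start a polling-based watcher and return the running observer.

    on_new_file is a hypothetical callback that receives the path of each
    newly created file; interval is the polling period in seconds.
    """
    # Imported lazily so that importing this module does not require
    # watchdog until a watcher is actually started.
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers.polling import PollingObserver

    class NewFileHandler(FileSystemEventHandler):
        def on_created(self, event):
            # Skip directories, as the handler in the question does.
            if not event.is_directory:
                on_new_file(event.src_path)

    # PollingObserver diffs directory snapshots every `interval` seconds,
    # so it does not depend on inotify events reaching the container.
    observer = PollingObserver(timeout=interval)
    observer.schedule(NewFileHandler(), directory, recursive=True)
    observer.start()
    return observer
```

With the handler from the question, the call might look like watch_with_polling(directory, lambda p: upload_file(p, os.path.basename(p))), followed by the same sleep loop and observer.stop() / observer.join() as before. Like the hand-written loop, this is still polling under the hood, so a file may be reported while it is still being written.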