watchdog.observers.Observer works in Windows, works in docker on Linux, does not work in docker on Windows


I have an interesting problem that is driving me nuts. I have a python program that is using watchdog.observers.Observer. This program (aka watcher) watches a folder and responds when files appear in it. I have another program (aka parser) which periodically populates the watched folder with files.

  1. When the watcher program runs in Windows and the parser runs in a docker container on Windows, there is happiness.
  2. When the watcher program runs in a docker container on a Linux box and the parser runs in another docker container on the Linux box, there is happiness.
  3. When the watcher program runs in a docker container on Windows and the parser runs in another docker container on Windows, happiness is not achieved. The parser populates the folder with files, but the watcher never observes them.

Here's my watcher code:

import os
import sys
import time
   
from watchdog.observers import Observer
from event_handler import ImagesEventHandler
from constants import ROOT_FOLDER, IMAGES_FOLDER, CWD


class ImagesWatcher:
    def __init__(self, src_path):
        self.__src_path = src_path
        print(self.__src_path)
        self.__event_handler = ImagesEventHandler()
        self.__event_observer = Observer()
        print("********** Inside ImagesWatcher __init__ method just after instantiating ImagesEventHandler and Observer **************")

    def run(self):
        print("********** Inside ImagesWatcher run method **************")
        self.start()
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            self.stop()

    def start(self):
        print("********** Inside ImagesWatcher start method **************")
        self.__schedule()
        self.__event_observer.start()

    def stop(self):
        print("********** Inside ImagesWatcher stop method **************")
        self.__event_observer.stop()
        self.__event_observer.join()

    def __schedule(self):
        print("********** Inside ImagesWatcher __schedule method **************")
        print(self.__src_path)
        self.__event_observer.schedule(
            self.__event_handler,
            self.__src_path,
            recursive=True
        )

if __name__ == "__main__":
    src_path = sys.argv[1] if len(sys.argv) > 1 else CWD
    src_path = os.path.abspath(src_path)
    watch_path = os.path.join(src_path, ROOT_FOLDER)
    watch_path = os.path.join(watch_path, IMAGES_FOLDER)
    print('watch_path: ' + watch_path)

    if not os.path.exists(watch_path):
        os.makedirs(watch_path)
        print('just created: ' + watch_path)

    ImagesWatcher(watch_path).run()

Here's the associated event handler code:

import os
from PIL import Image
from watchdog.events import FileSystemEventHandler
from lambda_function import lambda_handler
from time import sleep
from os.path import dirname, abspath

class ImagesEventHandler(FileSystemEventHandler):

    def __init__(self):
        # Let the watchdog base class initialise itself before logging.
        super().__init__()
        print("********** Inside event handler __init__ method **************")
    
    def on_created(self, event):
        print("********** Inside event handler on_created method **************")
        # Skip directory events; only files can be opened as images.
        if event.is_directory:
            return
        self.process(event)

    def process(self, event):
        print("********** Inside event handler process method **************")
        sleep(2)
        image = Image.open(event.src_path)
        tracking_dir=os.path.join(dirname(dirname(abspath(event.src_path))),'Tracking')
        print("********************  tracking_dir: " + tracking_dir + " ********************")
        lambda_handler(image,tracking_dir)

The stop method of the watcher is never executed. The __init__ method of the event handler is executed, but neither on_created nor process is ever called.

Here's how I build and run the docker containers:

docker build -t watcher -f docker/watcher/Dockerfile . 
docker run -d --network onprem_network -v c:\My_MR:/code/My_MR --name watcher watcher 

docker build -t parser -f docker/parser/Dockerfile . 
docker run -d --network onprem_network -v c:\My_MR:/code/My_MR --name parser parser 

My watcher Dockerfile:

FROM python:3.7.9
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
COPY requirements.txt /requirements.txt
RUN pip install --upgrade pip -r /requirements.txt && mkdir /code 
WORKDIR /code
COPY . /code/
RUN apt-get update && apt-get install -y tesseract-ocr ffmpeg libsm6 libxext6
CMD ["python", "/code/watcher.py"]

My parser Dockerfile:

FROM python:3.7.9
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
COPY requirements.txt /requirements.txt
RUN pip install --upgrade pip -r /requirements.txt && mkdir /code
WORKDIR /code
COPY . /code/
RUN apt-get update && apt-get install -y ffmpeg
CMD ["python", "/code/parser.py"]

My requirements.txt:

Pillow == 5.4.1
gql == 3.0.0a5
matplotlib == 3.0.3
numpy == 1.16.2
opencv_python == 4.4.0.44
pandas == 0.24.2
pytesseract == 0.2.6
python_ffmpeg_video_streaming == 0.1.14
watchdog == 2.0.2
requests
tesseract

Any help would be greatly appreciated.


Solution

  • The underlying API that watchdog uses to monitor Linux filesystem events is called inotify. The Docker for Windows WSL 2 backend documentation notes:

    Linux containers only receive file change events (“inotify events”) if the original files are stored in the Linux filesystem.

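    The directory you're mounting, c:\My_MR, resides on the Windows file system, so the watcher container never receives inotify events for files created there. That is also why your other two setups work: on native Windows, watchdog uses the Windows change-notification API rather than inotify, and on the Linux box the mounted directory lives on a real Linux filesystem.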

    Instead, you can run docker from inside your WSL 2 default distribution with a Linux filesystem path, e.g., ~/my_mr:

    docker run -d --network onprem_network -v ~/my_mr:/code/My_MR --name watcher watcher 
    docker run -d --network onprem_network -v ~/my_mr:/code/My_MR --name parser parser 
    

    While that WSL 2 distribution is running, this directory can be accessed from Windows through the \\wsl$\ network path, i.e., \\wsl$\<Distro name>\home\<username>\my_mr. Accordingly, I believe docker run could also be used from Windows using the \\wsl$\ path with -v.
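
If moving the folder into the Linux filesystem is not an option, polling is a common fallback because it does not depend on inotify at all. The sketch below is a minimal standard-library illustration of that idea, not something taken from the question's code: it repeatedly lists the watched folder and reports files that were not there on the previous pass. The names snapshot, new_files and poll_forever are hypothetical helpers, and the one-second interval is an assumption. watchdog also ships a PollingObserver in watchdog.observers.polling that could probably be swapped in for Observer; I have not tested either approach on this exact setup.

```python
import os
import time


def snapshot(folder):
    """Return the set of file paths currently under folder (recursive)."""
    found = set()
    for root, _dirs, files in os.walk(folder):
        for name in files:
            found.add(os.path.join(root, name))
    return found


def new_files(previous, folder):
    """Return (created, current), where created lists files absent from previous."""
    current = snapshot(folder)
    return sorted(current - previous), current


def poll_forever(folder, on_created, interval=1.0):
    """Call on_created(path) for each file that appears under folder.

    Hypothetical helper: the interval is an assumed default, and this loop
    runs until the process is interrupted.
    """
    seen = snapshot(folder)
    while True:
        time.sleep(interval)
        created, seen = new_files(seen, folder)
        for path in created:
            on_created(path)
```

In the watcher, poll_forever(watch_path, handler) would take the place of the observer loop, where handler is any function that accepts the new file's path. Polling costs one directory walk per interval, so it suits folders holding a modest number of files.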