python-multithreading python-3.4 python-watchdog

Understanding this multithreading daemon Python code

I am a beginner in Python and am working on a filesystem event handler. I came across the watchdog API, and there I saw some multithreading code that I cannot understand.

Here is the code that is published on their website:

import sys
import time
import logging
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

This code runs an infinite loop, watches a folder, and logs what it sees to the console. My question is about the very bottom of the code.

So you start the observer, then go into an infinite loop until the user interrupts it from the keyboard. I am assuming that somewhere in the "observer.start()" code they also set daemon=True. On the interrupt, the program breaks out of the loop and stops the observer. In watchdog's API, the definition of stop() says that it stops the daemon thread.

1) Then it does a join(). But why is this join needed? I have already stopped the daemon thread. Doesn't join() mean "wait for all the threads to stop, and only then exit the program"? Can I remove the join() from the code? After I remove it, my program still works correctly.

2) I also don't understand the need for sleep(1) inside the while loop. What will happen if I just put a "pass" statement there? I am assuming that the while loop will then consume more resources. And I assume the sleep time is 1 second rather than 2-3 seconds because otherwise, in the worst case, the user might have to wait 2-3 seconds for the program to close. But I might be wrong.

Solution

1) Remember that the daemon thread runs inside the parent process. You need to keep the parent process alive while that thread is executing, or else the thread is killed when the program exits (and likely not gracefully). The join makes sure the process stays alive until the thread has actually exited; just because you called stop doesn't guarantee that the thread has finished executing. stop is a request for the thread to stop; it is not required to block until the thread terminates (nor should it, so that a parent thread can call stop on many child threads 'at once').

2) The sleep is purely for reduced CPU consumption. If you simply had a pass in there, the CPU would run that while loop as fast as possible, wasting cycles. The sleep call voluntarily yields the CPU to other processes, since the loop doesn't need to respond quickly to any particular condition. And you are essentially correct: it's sleep(1) so that your worst-case response time is approximately 1 second.
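
To make the CPU-consumption point concrete, here is a small sketch that counts how often a loop body runs in a fixed time window, once with a bare busy-wait and once with a sleep in the body. The function name count_iterations and the durations are made up for illustration; the exact counts will vary by machine.

```python
import time


def count_iterations(duration, sleep_time=None):
    """Loop for `duration` seconds and return how many times the body ran.

    With sleep_time=None the body does no waiting (like `pass`);
    otherwise every pass through the loop sleeps for `sleep_time` seconds.
    """
    iterations = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if sleep_time is not None:
            time.sleep(sleep_time)  # give the CPU back to other work
        iterations += 1
    return iterations


busy = count_iterations(0.2)                     # busy-wait: spins flat out
polite = count_iterations(0.2, sleep_time=0.05)  # sleeps on every pass
print("busy-wait iterations:", busy)
print("sleeping iterations: ", polite)
```

The busy-wait version typically runs the loop body many thousands of times in the same window in which the sleeping version runs it only a handful of times, which is the wasted work the sleep avoids.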

UPDATE:

Here is an example of why having a join is important. Say the following was running in a thread:

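import os

while not self.stopped:  # self.stopped is set to True when stop() is called
    ...
    self.results.append(item)  # do some work that appends results to a list
# The loop has ended: write out everything collected so far.
# open() does not expand '~' by itself, hence expanduser().
with open(os.path.expanduser('~/output.txt'), 'w') as outfile:
    outfile.write('\n'.join(str(item) for item in self.results))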

When stop is called, the while loop will terminate, and the result file will be opened and written. If join weren't called, the process could terminate before the write operation completes, leaving a corrupted or incomplete file. The join ensures that the parent thread waits for this write to finish. It also ensures that the process waits for the current iteration of the while loop to finish; without the join you could not only miss the file write, but also terminate in the middle of the loop body.

If, however, the thread that had stop called on it doesn't do anything lengthy after the while loop terminates, join returns almost instantly and is effectively a no-op.
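
Here is a self-contained sketch of the same idea that you can run. Everything in it is hypothetical and only for illustration (the ResultWriter class, the stop() method that sets a flag, the timings, the temporary output file); it is not how watchdog itself is implemented. It shows that the thread is usually still alive right after stop() returns, and that the file is only safe to read after join().

```python
import os
import tempfile
import threading
import time


class ResultWriter(threading.Thread):
    """Hypothetical worker: collects items until asked to stop, then saves them."""

    def __init__(self, out_path):
        # daemon=True: the thread is killed if the process exits first
        super().__init__(daemon=True)
        self.out_path = out_path
        self.results = []
        self._stop_requested = threading.Event()

    def stop(self):
        # Only a request: set the flag and return without waiting.
        self._stop_requested.set()

    def run(self):
        item = 0
        while not self._stop_requested.is_set():
            self.results.append(item)
            item += 1
            time.sleep(0.01)
        time.sleep(0.2)  # stand-in for slow work done after the loop ends
        with open(self.out_path, 'w') as outfile:
            outfile.write('\n'.join(str(r) for r in self.results))


def run_demo():
    with tempfile.TemporaryDirectory() as tmp_dir:
        worker = ResultWriter(os.path.join(tmp_dir, 'output.txt'))
        worker.start()
        time.sleep(0.05)
        worker.stop()
        alive_after_stop = worker.is_alive()  # checked before the thread can finish
        worker.join()                         # block until run() has returned
        alive_after_join = worker.is_alive()
        with open(worker.out_path) as infile:
            saved_text = infile.read()
    return alive_after_stop, alive_after_join, saved_text, worker.results


alive_after_stop, alive_after_join, saved_text, results = run_demo()
print("alive right after stop():", alive_after_stop)
print("alive after join():", alive_after_join)
print("lines saved:", len(saved_text.splitlines()))
```

If you remove the worker.join() line, the read that follows would most likely fail or see an incomplete file, because the thread has not reached its write yet.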

UPDATE 2:

With respect to the sleep call, certain events (such as Ctrl+C) can interrupt even a sleep call in the parent process. So in this particular case, the length of the sleep doesn't really matter all that much. Setting it to 1 second is mostly just convention, to make it clear that you're basically doing a 'yield CPU' rather than actually sleeping.