One of my processes writes data to a text file, and a SQL stored procedure then stages that data in a SQL table. I don't know when the file will arrive, so I need a file watcher that looks for the file and, once it is available, stages the data into the SQL table.
I have tried the code below, but I can't work out how to stop watching and execute the SQL stored procedure once the file arrives. For example, the filename is Process1_Timestamp.txt
So far I have done the following:
Created a function to return the files in a directory.
Created a function to compare two lists.
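The two helper functions are not shown in the question. A minimal sketch of what they might look like, assuming `fileInDirectory` returns only file names (not subdirectories) and `listComparison` returns the names that appear in the new list but not in the old one:

```python
import os


def fileInDirectory(my_dir: str) -> list:
    """Return the names of the regular files directly inside my_dir."""
    return [name for name in os.listdir(my_dir)
            if os.path.isfile(os.path.join(my_dir, name))]


def listComparison(originalList: list, newList: list) -> list:
    """Return the entries of newList that are not in originalList."""
    seen = set(originalList)
    return [name for name in newList if name not in seen]
```

With these definitions, a non-empty result from `listComparison` means at least one new file appeared between two polls.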
And then this:
import time

def fileWatcher(my_dir: str, pollTime: int):
    while True:
        if 'watching' not in locals():  # Check if this is the first pass through the loop
            previousFileList = fileInDirectory(my_dir)
            watching = 1
            print('First attempt')
            print(previousFileList)
        time.sleep(pollTime)
        newFileList = fileInDirectory(my_dir)
        fileDiff = listComparison(previousFileList, newFileList)
        previousFileList = newFileList
        if len(fileDiff) == 0:
            continue
        doThingsWithNewFiles(fileDiff)
How can I stop watching once the file arrives and trigger the next SQL process?
Have you looked at Watchdog? https://pythonhosted.org/watchdog/
The code below is largely taken from the example at https://pythonhosted.org/watchdog/quickstart.html#a-simple-example and uses FileSystemEventHandler https://pythonhosted.org/watchdog/api.html#watchdog.events.FileSystemEventHandler
import sys
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class CustomHandler(FileSystemEventHandler):
    def on_created(self, event):
        print(f'File or directory name: {event.src_path}')
        # do stuff

if __name__ == "__main__":
    # Use the current directory if no path is supplied as the first argument;
    # you could provide the path in any other way you like.
    path = sys.argv[1] if len(sys.argv) > 1 else '.'
    event_handler = CustomHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
In a more minimal form:
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class CustomHandler(FileSystemEventHandler):
    def on_created(self, event):
        print(f'File or directory name: {event.src_path}')
        # do stuff

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(CustomHandler(), '/path/to/my/directory', recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
FileSystemEventHandler has other methods besides on_created() that you can override if you want your code called for events other than file or directory creation.
If you are not interested in directory events, you can filter on the event's attributes, e.g.
class CustomHandler(FileSystemEventHandler):
    def on_created(self, event):
        print(f'Created file or directory name: {event.src_path}')
        # do stuff

    def on_modified(self, event):
        if not event.is_directory:
            print(f'Modified file name: {event.src_path}')
            # do other stuff
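The same idea extends to filtering by file name, so that only the file from the question triggers any work. A small sketch using only the standard library; the `Process1_*.txt` pattern is an assumption based on the example filename in the question, and `is_target_file` is a hypothetical helper name:

```python
import fnmatch
import os

# Assumed pattern, derived from the example name Process1_Timestamp.txt
TARGET_PATTERN = "Process1_*.txt"


def is_target_file(path: str, pattern: str = TARGET_PATTERN) -> bool:
    """Return True if the file name (not the full path) matches pattern."""
    return fnmatch.fnmatchcase(os.path.basename(path), pattern)
```

Inside a handler method this could be used as `if not event.is_directory and is_target_file(event.src_path):` before doing anything with the file.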
If you want to stop after finding the first file:
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class CustomHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            print(f'File name: {event.src_path}')
            # do stuff, e.g. call the stored procedure
            observer.stop()  # 'observer' is the module-level variable created below

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(CustomHandler(), '.', recursive=True)
    observer.start()
    try:
        # The loop ends once the handler has called observer.stop()
        while observer.should_keep_running():
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
Make sure you read and understand the caveats at https://pythonhosted.org/watchdog/installation.html#supported-platforms-and-caveats. They may apply to any solution, not just Watchdog.
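If adding a dependency is not an option, the polling approach from the question can also be made to stop: return from the function as soon as a matching file is seen, and let the caller run the next step. A rough sketch under a few assumptions: `wait_for_file`, its `pattern` and `timeout` parameters are hypothetical names, and unlike the list-comparison version it also matches a file that already exists when polling starts.

```python
import fnmatch
import os
import time
from typing import Optional


def wait_for_file(my_dir: str, pattern: str, pollTime: float = 5.0,
                  timeout: Optional[float] = None) -> Optional[str]:
    """Poll my_dir until a file matching pattern exists, then return its path.

    Returns None if timeout (in seconds) elapses first; timeout=None waits forever.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        for name in sorted(os.listdir(my_dir)):
            full_path = os.path.join(my_dir, name)
            if os.path.isfile(full_path) and fnmatch.fnmatchcase(name, pattern):
                return full_path  # returning is what ends the watch loop
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(pollTime)
```

The caller would then do something like `path = wait_for_file('/path/to/my/directory', 'Process1_*.txt')` and, if `path` is not None, invoke whatever code executes the stored procedure. Note that the file may still be in the middle of being written when it is first seen, so the staging step may need to allow for that.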