I'm trying to monitor a CSV file that is being written to by a separate program. Around every 10 seconds, the CSV file is updated with a couple more lines. Each time the file is updated, I want to be able to detect the file has been changed (will always be the same file), take the new lines, and write them to console (just for a test).
I have looked around the website and found numerous ways of watching a file to see if it's been updated (like so: http://thepythoncorner.com/dev/how-to-create-a-watchdog-in-python-to-look-for-filesystem-changes/), but I can't find anything that lets me get at the changes made to the file so I can print them to the console.
Current code:
import time
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

def on_created(event):
    print(f"hey, {event.src_path} has been created!")

def on_deleted(event):
    print(f"Someone deleted {event.src_path}!")

def on_modified(event):
    print(f"{event.src_path} has been modified")

def on_moved(event):
    print(f"ok ok ok, someone moved {event.src_path} to {event.dest_path}")

if __name__ == "__main__":
    patterns = "*"
    ignore_patterns = ""
    ignore_directories = False
    case_sensitive = True
    my_event_handler = PatternMatchingEventHandler(patterns, ignore_patterns, ignore_directories, case_sensitive)
    my_event_handler.on_created = on_created
    my_event_handler.on_deleted = on_deleted
    my_event_handler.on_modified = on_modified
    my_event_handler.on_moved = on_moved
    path = "."
    go_recursively = True
    my_observer = Observer()
    my_observer.schedule(my_event_handler, path, recursive=go_recursively)
    my_observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        my_observer.stop()
    my_observer.join()
This runs, but looks for changes in files all over the place. How do I make it listen for changes from one single file?
If you're more or less happy with the script other than it tracking a bunch of files, you can change the patterns = "*" line. That is a wildcard string which tells the PatternMatchingEventHandler to match any file. Change it to patterns = ["my_file.csv"] (patterns is meant to be a list of pattern strings), and also point the path variable at the directory the file is in, which saves recursively scanning every directory under '.'. You then don't need recursive set to True either, since you're only watching a single file.
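If the pattern doesn't match the way you expect, an extra guard inside the handler is to compare the event path against the one file you care about. This is only a sketch: the file name my_file.csv and the helper is_target are made up for illustration and are not part of watchdog.

```python
from pathlib import Path

# Hypothetical name of the CSV being watched; change to your own file.
TARGET = Path("my_file.csv")

def is_target(event_path, target=TARGET):
    """Return True if event_path points at the same file as target."""
    # resolve() normalises things like "./my_file.csv" vs "my_file.csv"
    return Path(event_path).resolve() == Path(target).resolve()
```

In on_modified you would then start with a check along the lines of "if not is_target(event.src_path): return", so that events for any other file are ignored.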
As for printing the new lines to the console, here is one option:
import pandas as pd
...
def on_modified(event):
    print(f"{event.src_path} has been modified")
    # You said "a couple more lines"; I'm going to take that
    # as two:
    df = pd.read_csv(event.src_path)
    print("Newest 2 lines:")
    print(df.tail(2))
If it's not always two lines, you'll want to keep track of how much of the file you have already read and pass that to the function which opens the CSV, so it knows which lines are new.
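One way to do that tracking, sketched with only the standard library rather than pandas: remember the byte offset you last read up to, and on each change read from there to the end. The class name NewLineReader is invented for this example, and it assumes the other program only appends to the file and writes UTF-8 text.

```python
import os

class NewLineReader:
    """Remembers how far into a file we have read and returns only newer lines."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position of the first unread byte

    def read_new_lines(self):
        with open(self.path, "rb") as f:
            f.seek(0, os.SEEK_END)
            if f.tell() < self.offset:
                # File is shorter than before (truncated or rewritten): start over.
                self.offset = 0
            f.seek(self.offset)
            data = f.read()
        # Only consume up to the last newline, so a half-written
        # final line is left for the next call.
        end = data.rfind(b"\n") + 1
        complete = data[:end]
        self.offset += len(complete)
        return complete.decode("utf-8").splitlines()
```

You would create one reader up front, e.g. reader = NewLineReader("my_file.csv"), and in on_modified loop over reader.read_new_lines() and print each line. The first call returns everything already in the file, including the header row. If the same write happens to trigger more than one modified event, the later calls simply return an empty list.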