I have a CSV file (ZN_15M) that I'm trying to read with the read_csv function every hour. I have APScheduler installed and am trying to use it to read the CSV file hourly (along with some other processing not shown, but if I can get the read_csv part going, the rest will work too):
import sys
from time import sleep
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.start()

def Run():
    f2 = open('C:\Users\cost9\OneDrive\Documents\PYTHON\Exported_Data\ZN_ES\ZN_15M.csv')
    ZN = pd.read_csv(f2)
    #Do stuff to the CSV File/DataFrame
    ZN.tocsv(path_or_buf = 'path')

def main():
    job = scheduler.add_interval_job(Run, minutes=60, args=())
    while True:
        sleep(60)
        sys.stdout.write('.'); sys.stdout.flush()
I don't get any errors when I manually run the script, but nothing is running hourly like I'd like. Not sure what I'm doing wrong here...
Update: I'm now getting an error with the code below:
def process_csv(path_to_csv):
    ZN_ES_comb = pd.read_csv(path_to_csv)
    # Insert your CSV processing here
    ZN_ES_comb = pd.DataFrame(ZN_ES_comb)
    ZN_ES_comb.to_csv(path_to_csv.replace('.csv', '_modified_{timestamp}.csv').format(
        timestamp=time.strftime("%Y%m%d-%H%M%S")), index=False)

if __name__ == '__main__':
    # Create CSV for demonstrating purposes
    path_to_csv = 'C:\Users\cost9\OneDrive\Documents\PYTHON\Daily Tasks\ZN_ES\ZN_ES_15M\CSV\ZN_ES_comb.csv'
    pd.DataFrame(ZN_ES_comb).to_csv(path_to_csv, index=False)
    # Start scheduler
    scheduler = BackgroundScheduler()
    scheduler.start()
    scheduler.add_job(func=process_csv,
                      args=[path_to_csv],
                      trigger=IntervalTrigger(seconds=2))
    # Wait for 7 seconds so that scheduler can call process_csv 3 times
    time.sleep(7)
The error is raised on the line pd.DataFrame(ZN_ES_comb).to_csv(path_to_csv, index=False). It says:
NameError: name 'ZN_ES_comb' is not defined
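A likely cause of this NameError: ZN_ES_comb is only assigned inside process_csv, so it is a local name there and does not exist yet when the __main__ block runs. If the real CSV is already on disk, the "create CSV for demonstrating purposes" line may not be needed at all. A minimal sketch of the scoping behaviour, using hypothetical names (process, combined) and no pandas:

```python
def process(rows):
    # 'combined' is local to this function; it is gone once the call returns
    combined = [row.upper() for row in rows]
    return combined


def use_undefined_name():
    # Mirrors the failing line: 'combined' was never assigned in this scope
    try:
        return combined
    except NameError as exc:
        return str(exc)


# Holds the NameError text produced by referencing the missing name
message = use_undefined_name()

# Fix: keep the function's return value in a name defined in the calling scope
result = process(['zn', 'es'])
```

The same idea would apply to the DataFrame: either build it in the __main__ block before using it there, or drop the line and let process_csv read the existing file.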
There are two issues in your code:
1. In def Run(), use ZN.to_csv() instead of ZN.tocsv().
2. time.sleep() is measured in seconds, not in minutes as you apparently assumed. Thus, Run() was never run during the sleep.
Below is a working solution that works with Python 3.5 and APScheduler 3.3.1. IntervalTrigger() also has an hours parameter, which you might want to use instead of seconds.
import time
import pandas as pd
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger

def process_csv(path_to_csv):
    df = pd.read_csv(path_to_csv)
    # Insert your CSV processing here
    df.to_csv(path_to_csv.replace('.csv', '_modified_{timestamp}.csv').format(
        timestamp=time.strftime("%Y%m%d-%H%M%S")), index=False)

if __name__ == '__main__':
    # Create CSV for demonstrating purposes
    path_to_csv = 'made_up.csv'
    pd.DataFrame({'fruit': ['apple', 'banana'],
                  'number': [1, 2]}).to_csv(path_to_csv, index=False)
    # Start scheduler
    scheduler = BackgroundScheduler()
    scheduler.start()
    scheduler.add_job(func=process_csv,
                      args=[path_to_csv],
                      trigger=IntervalTrigger(seconds=2))
    # Wait for 7 seconds so that scheduler can call process_csv 3 times
    time.sleep(7)