Python APScheduler: an easier way to run jobs?


I have jobs scheduled through APScheduler. I have 3 jobs so far, but will soon have many more, so I'm looking for a way to scale my code.

Currently, each job is its own .py file, and in each file I have turned the script into a function named run(). Here is my code:

from apscheduler.scheduler import Scheduler
import logging

import job1
import job2
import job3

logging.basicConfig()
sched = Scheduler()

# Runs every day at 07:00.
@sched.cron_schedule(day_of_week='mon-sun', hour=7)
def runjobs():
    job1.run()
    job2.run()
    job3.run()

sched.start()

This works and gets the job done, but the code is crude: every new job needs its own import and its own call, so with 50 jobs the file will become very long. How do I scale it?

Note: the actual job names are arbitrary and don't follow a pattern. The file is named scheduler.py, and I run it with execfile('scheduler.py') in the Python shell.


Solution

    import urllib  # Python 2; in Python 3, urlopen lives in urllib.request
    import threading
    import datetime
    
    pages = ['http://google.com', 'http://yahoo.com', 'http://msn.com']
    
    #------------------------------------------------------------------------------
    # Getting the pages WITHOUT threads
    #------------------------------------------------------------------------------
    def job(url):
        response = urllib.urlopen(url)
        html = response.read()
    
    def runjobs():
        for page in pages:
            job(page)
    
    start = datetime.datetime.now()
    runjobs()
    end = datetime.datetime.now()
    
    # Report the whole elapsed time (timedelta.microseconds holds only the sub-second part)
    print("jobs run in {} seconds WITHOUT threads"
          .format((end - start).total_seconds()))
    
    #------------------------------------------------------------------------------
    # Getting the pages WITH threads
    #------------------------------------------------------------------------------
    def job(url):
        response = urllib.urlopen(url)
        html = response.read()
    
    def runjobs():
        threads = []
        for page in pages:
            t = threading.Thread(target=job, args=(page,))
            t.start()
            threads.append(t)
    
        for t in threads:
            t.join()
    
    start = datetime.datetime.now()
    runjobs()
    end = datetime.datetime.now()
    
    # Report the whole elapsed time (timedelta.microseconds holds only the sub-second part)
    print("jobs run in {} seconds WITH threads"
          .format((end - start).total_seconds()))
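
The same thread-per-job pattern could be pointed at the job modules from the question instead of URLs. The following is only a minimal sketch, not tested against the asker's setup: it assumes every job module exposes a run() function, as the question describes, and the names JOB_MODULES, run_all, and the importer parameter are made up for this illustration. It uses only the standard library.

```python
import importlib
import logging
import threading

# Hypothetical list: one entry per job module; each is assumed to define run().
JOB_MODULES = ["job1", "job2", "job3"]


def _run_one(name, importer, failed):
    """Import one job module and call its run(), recording the name on failure."""
    try:
        importer(name).run()
    except Exception:
        # Log the traceback but keep going so one bad job does not stop the rest.
        logging.exception("job %s failed", name)
        failed.append(name)


def run_all(module_names, importer=importlib.import_module):
    """Run every job in its own thread; return the sorted names of failed jobs."""
    failed = []
    threads = [
        threading.Thread(target=_run_one, args=(name, importer, failed))
        for name in module_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(failed)
```

With something like this in place, the body of the scheduled runjobs() could shrink to a single run_all(JOB_MODULES) call, and adding a job would mean adding one name to the list rather than a new import and a new call. Whether threads actually help depends on the jobs: as in the page-fetching example, the gain comes mostly when jobs spend their time waiting on I/O.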