python, mysql, apscheduler

How to save a Python APScheduler scheduler using SQLAlchemyJobStore


I'm currently building a personal news aggregator on a localhost server using Python. I want to fetch RSS feeds automatically from many different sites at a specific time every day. I searched the internet and found the APScheduler library. To avoid creating a new scheduler every time I turn on my computer, I think it would be better to save the scheduler in my MySQL database using SQLAlchemyJobStore.

There is a lot of information about creating and configuring a scheduler here, but I can't find any mention of storing and loading it. Assuming the code below is the scheduler I want to create, how can I save the scheduler variable in my database and load it again?

from pytz import utc

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.mongodb import MongoDBJobStore
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor


jobstores = {
    'mongo': MongoDBJobStore(),
    'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
}
executors = {
    'default': ThreadPoolExecutor(20),
    'processpool': ProcessPoolExecutor(5)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}
scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)

Solution

  • As @Sraw mentioned, you do not need to worry about how to load jobs; the scheduler does that for you when it starts. You forgot to add one line at the end:

    scheduler.start()
    

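    Just do not forget to pass the jobstore argument when you add jobs. Note that add_job also needs the function to run as its first argument; my_job here is a placeholder for your own function.

    scheduler.add_job(my_job, jobstore='mongo', trigger='cron', minute=8)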
    

    In my case I am using only:

    jobstores = {
        'mongo': MongoDBJobStore()
    }
    

    and it creates an 'apscheduler' database in my Mongo instance with a 'jobs' collection. You can check manually in the mongo shell whether the jobs have been stored. You can also use the scheduler's print_jobs() or get_jobs():

    scheduler.print_jobs(jobstore='mongo')  # 'mongo' in my case only
    jobs = scheduler.get_jobs(jobstore='mongo')  # 'mongo' in my case only
    

    In the end, your code will look like this:

    from pytz import utc
    
    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.mongodb import MongoDBJobStore
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
    
    
    jobstores = {
      'mongo': MongoDBJobStore(),
      'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
    }
    executors = {
      'default': ThreadPoolExecutor(20),
      'processpool': ProcessPoolExecutor(5)
    }
    job_defaults = {
      'coalesce': False,
      'max_instances': 3
    }
    scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors,
                                    job_defaults=job_defaults, timezone=utc)
    scheduler.start()
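
Since the question asks about MySQL rather than SQLite or Mongo, here is a minimal sketch of how the same setup might look with a MySQL URL passed to SQLAlchemyJobStore. It is not taken from the original answer. The PyMySQL driver, the credentials, the database name, and the names mysql_jobstore_url, fetch_feeds and build_scheduler are all assumptions made for illustration. The explicit id together with replace_existing=True is there because a job added to a persistent store during startup would otherwise be added again on every restart.

```python
from urllib.parse import quote_plus


def mysql_jobstore_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy URL for MySQL (the PyMySQL driver is an assumption)."""
    # Escape characters such as '@' or '/' that would otherwise break the URL.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def fetch_feeds():
    """Hypothetical job body; replace it with your own RSS-fetching code."""
    print("fetching RSS feeds...")


def build_scheduler(url):
    """Create a scheduler whose default job store lives in the given database."""
    # Imported here so the URL helper above can be used (and tested)
    # without APScheduler installed.
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.schedulers.background import BackgroundScheduler
    from pytz import utc

    scheduler = BackgroundScheduler(
        jobstores={"default": SQLAlchemyJobStore(url=url)},
        timezone=utc,
    )
    # An explicit id plus replace_existing=True keeps a restart from
    # adding a second copy of the same job to the persistent store.
    scheduler.add_job(
        fetch_feeds,
        trigger="cron",
        hour=6,
        minute=0,
        id="fetch_feeds",
        replace_existing=True,
    )
    return scheduler
```

With these assumptions, build_scheduler(mysql_jobstore_url("news", "s3cr@t", "localhost", "newsdb")).start() would start the scheduler, and jobs already stored in the MySQL table are picked up at that point. Because a BackgroundScheduler runs in a background thread, the main program has to stay alive for jobs to fire. The job function also has to be importable at module level, since a persistent store saves a reference to it rather than the function itself.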