python docker flask celery celerybeat

celery-beat KeyError: 'scheduler'


I am trying to run a periodic Celery task using Celery beat and Docker for my Flask application. However, when I run the container, I get the error below:

    Removing corrupted schedule file 'celerybeat-schedule': error(22, 'Invalid argument')
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 42, in __get__
        return obj.__dict__[self.__name__]
    KeyError: 'scheduler'

I define my beat schedule in my settings.py like so:

    CELERYBEAT_SCHEDULE = {
        'fetch-expensify-reports': {
            'task': 'canopact.blueprints.carbon.tasks.fetch_reports',
            'schedule': 10.0
        }
    }

This config gets passed into my create_celery_app function in my app.py file:

    from celery import Celery

    # create_app and CELERY_TASK_LIST are defined elsewhere in the project (not shown).


    def create_celery_app(app=None):
        """
        Create a new Celery object and tie together the Celery config to the app's
        config. Wrap all tasks in the context of the application.

        :param app: Flask app
        :return: Celery app
        """
        app = app or create_app()

        celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'],
                        include=CELERY_TASK_LIST)
        # Copy the Flask settings (including CELERYBEAT_SCHEDULE) into Celery.
        celery.conf.update(app.config)
        TaskBase = celery.Task

        class ContextTask(TaskBase):
            abstract = True

            def __call__(self, *args, **kwargs):
                # Run every task inside the Flask application context.
                with app.app_context():
                    return TaskBase.__call__(self, *args, **kwargs)

        celery.Task = ContextTask
        return celery

I have tried to split the Celery worker and Celery beat into separate services in my docker-compose.yml file like so:

    celery:
      build: .
      command: celery worker -l info -A canopact.blueprints.contact.tasks
      env_file:
        - '.env'
      volumes:
        - '.:/canopact'

    celery_beat:
      build: .
      command: celery beat -l info -A canopact.blueprints.contact.tasks
      env_file:
        - '.env'
      volumes:
        - '.:/canopact'

However, I get the same issue. Following recommendations from other posts, I have also tried deleting my celerybeat-schedule file, which seems to be corrupted. However, when I run docker-compose up, the file is created again and the same error is thrown.

I am using Celery 4.3.0. Below is the full traceback from starting the container.

celery_beat_1  | celery beat v4.3.0 (rhubarb) is starting.
celery_beat_1  | __    -    ... __   -        _
celery_beat_1  | LocalTime -> 2020-05-24 19:44:38
celery_beat_1  | Configuration ->
celery_beat_1  |     . broker -> redis://:**@redis:6379/0
celery_beat_1  |     . loader -> celery.loaders.app.AppLoader
celery_beat_1  |     . scheduler -> celery.beat.PersistentScheduler
celery_beat_1  |     . db -> celerybeat-schedule
celery_beat_1  |     . logfile -> [stderr]@%INFO
celery_beat_1  |     . maxinterval -> 5.00 minutes (300s)
celery_beat_1  | [2020-05-24 19:44:38,622: INFO/MainProcess] beat: Starting...
celery_beat_1  | [2020-05-24 19:44:38,696: ERROR/MainProcess] Removing corrupted schedule file 'celerybeat-schedule': error(22, 'Invalid argument')
celery_beat_1  | Traceback (most recent call last):
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 42, in __get__
celery_beat_1  |     return obj.__dict__[self.__name__]
celery_beat_1  | KeyError: 'scheduler'
celery_beat_1  |
celery_beat_1  | During handling of the above exception, another exception occurred:
celery_beat_1  |
celery_beat_1  | Traceback (most recent call last):
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 485, in setup_schedule
celery_beat_1  |     self._store = self._open_schedule()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 475, in _open_schedule
celery_beat_1  |     return self.persistence.open(self.schedule_filename, writeback=True)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 243, in open
celery_beat_1  |     return DbfilenameShelf(filename, flag, protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 227, in __init__
celery_beat_1  |     Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/dbm/__init__.py", line 94, in open
celery_beat_1  |     return mod.open(file, flag, mode)
celery_beat_1  | _gdbm.error: [Errno 22] Invalid argument
celery_beat_1  | [2020-05-24 19:44:38,730: CRITICAL/MainProcess] beat raised exception <class '_gdbm.error'>: error(22, 'Invalid argument')
celery_beat_1  | Traceback (most recent call last):
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 42, in __get__
celery_beat_1  |     return obj.__dict__[self.__name__]
celery_beat_1  | KeyError: 'scheduler'
celery_beat_1  |
celery_beat_1  | During handling of the above exception, another exception occurred:
celery_beat_1  |
celery_beat_1  | Traceback (most recent call last):
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 485, in setup_schedule
celery_beat_1  |     self._store = self._open_schedule()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 475, in _open_schedule
celery_beat_1  |     return self.persistence.open(self.schedule_filename, writeback=True)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 243, in open
celery_beat_1  |     return DbfilenameShelf(filename, flag, protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 227, in __init__
celery_beat_1  |     Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/dbm/__init__.py", line 94, in open
celery_beat_1  |     return mod.open(file, flag, mode)
celery_beat_1  | _gdbm.error: [Errno 22] Invalid argument
celery_beat_1  |
celery_beat_1  | During handling of the above exception, another exception occurred:
celery_beat_1  |
celery_beat_1  | Traceback (most recent call last):
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/apps/beat.py", line 109, in start_scheduler
celery_beat_1  |     service.start()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 588, in start
celery_beat_1  |     humanize_seconds(self.scheduler.max_interval))
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 44, in __get__
celery_beat_1  |     value = obj.__dict__[self.__name__] = self.__get(obj)
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 632, in scheduler
celery_beat_1  |     return self.get_scheduler()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 627, in get_scheduler
celery_beat_1  |     lazy=lazy,
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 467, in __init__
celery_beat_1  |     Scheduler.__init__(self, *args, **kwargs)
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 226, in __init__
celery_beat_1  |     self.setup_schedule()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 493, in setup_schedule
celery_beat_1  |     self._store = self._destroy_open_corrupted_schedule(exc)
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 481, in _destroy_open_corrupted_schedule
celery_beat_1  |     return self._open_schedule()
celery_beat_1  |   File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 475, in _open_schedule
celery_beat_1  |     return self.persistence.open(self.schedule_filename, writeback=True)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 243, in open
celery_beat_1  |     return DbfilenameShelf(filename, flag, protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/shelve.py", line 227, in __init__
celery_beat_1  |     Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
celery_beat_1  |   File "/usr/local/lib/python3.7/dbm/__init__.py", line 94, in open
celery_beat_1  |     return mod.open(file, flag, mode)
celery_beat_1  | _gdbm.error: [Errno 22] Invalid argument
celery_beat_1  | [2020-05-24 19:44:38,736: WARNING/MainProcess] Traceback (most recent call last):
celery_beat_1  | [2020-05-24 19:44:38,737: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 42, in __get__
celery_beat_1  | [2020-05-24 19:44:38,738: WARNING/MainProcess] return obj.__dict__[self.__name__]
celery_beat_1  | [2020-05-24 19:44:38,739: WARNING/MainProcess] KeyError
celery_beat_1  | [2020-05-24 19:44:38,743: WARNING/MainProcess] :
celery_beat_1  | [2020-05-24 19:44:38,744: WARNING/MainProcess] 'scheduler'
celery_beat_1  | [2020-05-24 19:44:38,745: WARNING/MainProcess] During handling of the above exception, another exception occurred:
celery_beat_1  | [2020-05-24 19:44:38,746: WARNING/MainProcess] Traceback (most recent call last):
celery_beat_1  | [2020-05-24 19:44:38,747: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 485, in setup_schedule
celery_beat_1  | [2020-05-24 19:44:38,749: WARNING/MainProcess] self._store = self._open_schedule()
celery_beat_1  | [2020-05-24 19:44:38,751: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 475, in _open_schedule
celery_beat_1  | [2020-05-24 19:44:38,756: WARNING/MainProcess] return self.persistence.open(self.schedule_filename, writeback=True)
celery_beat_1  | [2020-05-24 19:44:38,757: WARNING/MainProcess] File "/usr/local/lib/python3.7/shelve.py", line 243, in open
celery_beat_1  | [2020-05-24 19:44:38,759: WARNING/MainProcess] return DbfilenameShelf(filename, flag, protocol, writeback)
celery_beat_1  | [2020-05-24 19:44:38,760: WARNING/MainProcess] File "/usr/local/lib/python3.7/shelve.py", line 227, in __init__
celery_beat_1  | [2020-05-24 19:44:38,761: WARNING/MainProcess] Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
celery_beat_1  | [2020-05-24 19:44:38,762: WARNING/MainProcess] File "/usr/local/lib/python3.7/dbm/__init__.py", line 94, in open
celery_beat_1  | [2020-05-24 19:44:38,764: WARNING/MainProcess] return mod.open(file, flag, mode)
celery_beat_1  | [2020-05-24 19:44:38,770: WARNING/MainProcess] _gdbm
celery_beat_1  | [2020-05-24 19:44:38,772: WARNING/MainProcess] .
celery_beat_1  | [2020-05-24 19:44:38,774: WARNING/MainProcess] error
celery_beat_1  | [2020-05-24 19:44:38,776: WARNING/MainProcess] :
celery_beat_1  | [2020-05-24 19:44:38,777: WARNING/MainProcess] [Errno 22] Invalid argument
celery_beat_1  | [2020-05-24 19:44:38,778: WARNING/MainProcess] During handling of the above exception, another exception occurred:
celery_beat_1  | [2020-05-24 19:44:38,779: WARNING/MainProcess] Traceback (most recent call last):
celery_beat_1  | [2020-05-24 19:44:38,779: WARNING/MainProcess] File "/usr/local/bin/celery", line 8, in <module>
celery_beat_1  | [2020-05-24 19:44:38,780: WARNING/MainProcess] sys.exit(main())
celery_beat_1  | [2020-05-24 19:44:38,782: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/__main__.py", line 16, in main
celery_beat_1  | [2020-05-24 19:44:38,783: WARNING/MainProcess] _main()
celery_beat_1  | [2020-05-24 19:44:38,785: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/celery.py", line 322, in main
celery_beat_1  | [2020-05-24 19:44:38,787: WARNING/MainProcess] cmd.execute_from_commandline(argv)
celery_beat_1  | [2020-05-24 19:44:38,788: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/celery.py", line 496, in execute_from_commandline
celery_beat_1  | [2020-05-24 19:44:38,795: WARNING/MainProcess] super(CeleryCommand, self).execute_from_commandline(argv)))
celery_beat_1  | [2020-05-24 19:44:38,796: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/base.py", line 298, in execute_from_commandline
celery_beat_1  | [2020-05-24 19:44:38,797: WARNING/MainProcess] return self.handle_argv(self.prog_name, argv[1:])
celery_beat_1  | [2020-05-24 19:44:38,798: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/celery.py", line 488, in handle_argv
celery_beat_1  | [2020-05-24 19:44:38,801: WARNING/MainProcess] return self.execute(command, argv)
celery_beat_1  | [2020-05-24 19:44:38,803: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/celery.py", line 420, in execute
celery_beat_1  | [2020-05-24 19:44:38,809: WARNING/MainProcess] ).run_from_argv(self.prog_name, argv[1:], command=argv[0])
celery_beat_1  | [2020-05-24 19:44:38,810: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/base.py", line 302, in run_from_argv
celery_beat_1  | [2020-05-24 19:44:38,812: WARNING/MainProcess] sys.argv if argv is None else argv, command)
celery_beat_1  | [2020-05-24 19:44:38,813: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/base.py", line 386, in handle_argv
celery_beat_1  | [2020-05-24 19:44:38,818: WARNING/MainProcess] return self(*args, **options)
celery_beat_1  | [2020-05-24 19:44:38,821: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/base.py", line 252, in __call__
celery_beat_1  | [2020-05-24 19:44:38,827: WARNING/MainProcess] ret = self.run(*args, **kwargs)
celery_beat_1  | [2020-05-24 19:44:38,827: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/bin/beat.py", line 109, in run
celery_beat_1  | [2020-05-24 19:44:38,830: WARNING/MainProcess] return beat().run()
celery_beat_1  | [2020-05-24 19:44:38,830: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/apps/beat.py", line 81, in run
celery_beat_1  | [2020-05-24 19:44:38,832: WARNING/MainProcess] self.start_scheduler()
celery_beat_1  | [2020-05-24 19:44:38,833: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/apps/beat.py", line 109, in start_scheduler
celery_beat_1  | [2020-05-24 19:44:38,834: WARNING/MainProcess] service.start()
celery_beat_1  | [2020-05-24 19:44:38,836: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 588, in start
celery_beat_1  | [2020-05-24 19:44:38,838: WARNING/MainProcess] humanize_seconds(self.scheduler.max_interval))
celery_beat_1  | [2020-05-24 19:44:38,841: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/kombu/utils/objects.py", line 44, in __get__
celery_beat_1  | [2020-05-24 19:44:38,843: WARNING/MainProcess] value = obj.__dict__[self.__name__] = self.__get(obj)
celery_beat_1  | [2020-05-24 19:44:38,844: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 632, in scheduler
celery_beat_1  | [2020-05-24 19:44:38,849: WARNING/MainProcess] return self.get_scheduler()
celery_beat_1  | [2020-05-24 19:44:38,850: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 627, in get_scheduler
celery_beat_1  | [2020-05-24 19:44:38,852: WARNING/MainProcess] lazy=lazy,
celery_beat_1  | [2020-05-24 19:44:38,852: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 467, in __init__
celery_beat_1  | [2020-05-24 19:44:38,855: WARNING/MainProcess] Scheduler.__init__(self, *args, **kwargs)
celery_beat_1  | [2020-05-24 19:44:38,859: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 226, in __init__
celery_beat_1  | [2020-05-24 19:44:38,864: WARNING/MainProcess] self.setup_schedule()
celery_beat_1  | [2020-05-24 19:44:38,865: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 493, in setup_schedule
celery_beat_1  | [2020-05-24 19:44:38,867: WARNING/MainProcess] self._store = self._destroy_open_corrupted_schedule(exc)
celery_beat_1  | [2020-05-24 19:44:38,868: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 481, in _destroy_open_corrupted_schedule
celery_beat_1  | [2020-05-24 19:44:38,873: WARNING/MainProcess] return self._open_schedule()
celery_beat_1  | [2020-05-24 19:44:38,874: WARNING/MainProcess] File "/usr/local/lib/python3.7/site-packages/celery/beat.py", line 475, in _open_schedule
celery_beat_1  | [2020-05-24 19:44:38,876: WARNING/MainProcess] return self.persistence.open(self.schedule_filename, writeback=True)
celery_beat_1  | [2020-05-24 19:44:38,879: WARNING/MainProcess] File "/usr/local/lib/python3.7/shelve.py", line 243, in open
celery_beat_1  | [2020-05-24 19:44:38,884: WARNING/MainProcess] return DbfilenameShelf(filename, flag, protocol, writeback)
celery_beat_1  | [2020-05-24 19:44:38,885: WARNING/MainProcess] File "/usr/local/lib/python3.7/shelve.py", line 227, in __init__
celery_beat_1  | [2020-05-24 19:44:38,886: WARNING/MainProcess] Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
celery_beat_1  | [2020-05-24 19:44:38,887: WARNING/MainProcess] File "/usr/local/lib/python3.7/dbm/__init__.py", line 94, in open
celery_beat_1  | [2020-05-24 19:44:38,889: WARNING/MainProcess] return mod.open(file, flag, mode)
celery_beat_1  | [2020-05-24 19:44:38,890: WARNING/MainProcess] _gdbm
celery_beat_1  | [2020-05-24 19:44:38,892: WARNING/MainProcess] .
celery_beat_1  | [2020-05-24 19:44:38,896: WARNING/MainProcess] error
celery_beat_1  | [2020-05-24 19:44:38,898: WARNING/MainProcess] :

Solution

  • This is odd. I don't have a real solution yet, but I found a way to work around it.

    Why we are getting the issue:

    Here is a passage from the Celery docs that explains what is happening:

    Beat needs to store the last run times of the tasks in a local database file (named celerybeat-schedule by default), so it needs access to write in the current directory, or alternatively you can specify a custom location for this file:
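
    The custom location mentioned in the quote can also be set in configuration rather than on the command line. The following is a minimal sketch, assuming the old-style uppercase setting names used in the question's settings.py. It also assumes the system temp directory inside the container is an acceptable place for the file; that path is illustrative only.

```python
import os
import tempfile

# Old-style name for Celery's beat_schedule_filename setting.
# Assumption: any writable directory outside the bind-mounted project
# folder will do; the system temp directory is used purely as an example.
CELERYBEAT_SCHEDULE_FILENAME = os.path.join(
    tempfile.gettempdir(), "celerybeat-schedule"
)
```

    This only changes where beat keeps its state. Whether it avoids the error depends on why opening the file fails in the first place, and a file kept in a temp directory is lost when the container is recreated.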

    Basically, Celery is trying to open the file named celerybeat-schedule, but for some reason it is failing.

    Why is it failing to open it in Docker?

    I have no clue for now...

    However, a comment I came across sheds some light:

    It's something related to file storage.
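
    One way to narrow this down is to reproduce, outside of Celery, the call that fails in the traceback (shelve opening a dbm file with writeback=True). The sketch below is not part of Celery, and can_open_shelf_in is a hypothetical helper name. It assumes that running it inside the beat container, with the same Python, hits the same storage behaviour.

```python
import os
import shelve
import shutil
import tempfile


def can_open_shelf_in(directory):
    """Return (True, None) if shelve can create a file in `directory`,
    otherwise (False, exception)."""
    try:
        # Work in a throwaway subdirectory so no real schedule file is touched.
        probe_dir = tempfile.mkdtemp(prefix="beat-probe-", dir=directory)
    except OSError as exc:
        return False, exc
    try:
        # Same kind of call as in the traceback: shelve.open(..., writeback=True)
        path = os.path.join(probe_dir, "celerybeat-schedule")
        with shelve.open(path, writeback=True) as store:
            store["probe"] = 1
        return True, None
    except Exception as exc:  # dbm backends raise their own error types
        return False, exc
    finally:
        shutil.rmtree(probe_dir, ignore_errors=True)


if __name__ == "__main__":
    # Compare the current (possibly bind-mounted) directory with the temp dir.
    for directory in (os.getcwd(), tempfile.gettempdir()):
        ok, err = can_open_shelf_in(directory)
        print(directory, "OK" if ok else "FAILED: %r" % (err,))
```

    If the current directory fails while the temp directory works, the problem is most likely where the file is stored rather than the schedule itself.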

    Here is my workaround.

    I decided to store the last run times of my tasks in Redis instead of in a file, and luckily I found a package, RedBeat (the redbeat module used below), which let me do that.

    What you can do is this:

    Update your Celery app config using:

    app.conf.redbeat_redis_url = "<your redis url>"
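
    For the app factory in the question, that setting could be wired in as follows. This is a sketch only: configure_redbeat is a hypothetical helper, and reusing CELERY_BROKER_URL assumes the broker is already Redis (the beat banner in the question shows a redis:// broker). The stand-in objects are there so the snippet runs without Celery or Redis installed.

```python
from types import SimpleNamespace


def configure_redbeat(celery, flask_config):
    """Store beat's schedule state in the Redis instance used as the broker."""
    # redbeat_redis_url is the setting shown above; reusing the broker URL
    # is an assumption that only holds when the broker is Redis.
    celery.conf.redbeat_redis_url = flask_config["CELERY_BROKER_URL"]
    return celery


if __name__ == "__main__":
    # Stand-ins for the real Celery app and Flask config (illustration only).
    fake_celery = SimpleNamespace(conf=SimpleNamespace())
    fake_config = {"CELERY_BROKER_URL": "redis://redis:6379/0"}
    configure_redbeat(fake_celery, fake_config)
    print(fake_celery.conf.redbeat_redis_url)
```

    In create_celery_app, the call would go right after celery.conf.update(app.config), for example configure_redbeat(celery, app.config).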
    

    Then, in your docker-compose.yml file, tell Celery beat which scheduler to use:

    celery:
      build: .
      command: celery worker -l info -A canopact.blueprints.contact.tasks
      env_file:
        - '.env'
      volumes:
        - '.:/canopact'

    celery_beat:
      build: .
      command: celery beat -l info -A canopact.blueprints.contact.tasks -S redbeat.RedBeatScheduler
      env_file:
        - '.env'
      volumes:
        - '.:/canopact'