django, heroku, amazon-s3, celery, django-media

Django on Heroku: Celery worker gets 403 Forbidden when reading S3 media files for processing


I'm really stuck on this one because I'm not sure where to start:

My Django project allows users to upload a spreadsheet and the app then processes and aggregates the uploaded data.

The file is uploaded to the media storage (served under MEDIA_URL) using a standard form and a Django model with a FileField. Once it's uploaded, a Celery worker reads the file and processes it, writing the output to another model.

This works fine locally, but not in production. I'm deploying to Heroku and using the cookiecutter-django project template. I've set up an S3 bucket and am using the django-storages library.

The files upload without a problem: I can access and delete them in the Django admin, and also in the S3 bucket.

However, when the Celery worker tries to read the file, I get an HTTP Error 403: Forbidden. I'm not sure how to approach this, because I don't know which part of the stack contains my mistake. Could it be my tasks.py module, the Heroku Redis add-on, or my settings.py module?


Solution

  • Celery has to be told which Django settings module to load its configuration from. I wasn't switching it to the production settings when deploying, so the worker was presumably running without the production storage configuration.

    This is my fixed celery_app.py:

    import os
    
    from celery import Celery
    
    # set the default Django settings module for the 'celery' program.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings.production")
    
    app = Celery("<project_name>")
    
    # Using a string here means the worker doesn't have to serialize
    # the configuration object to child processes.
    # - namespace='CELERY' means all celery-related configuration keys
    #   should have a `CELERY_` prefix.
    app.config_from_object("django.conf:settings", namespace="CELERY")
    
    # Load task modules from all registered Django app configs.
    app.autodiscover_tasks()
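
One detail worth keeping in mind with the fix above: `os.environ.setdefault` only supplies a value when `DJANGO_SETTINGS_MODULE` is not already set, so a value set elsewhere (for example as a Heroku config var) takes precedence over the one in celery_app.py. The sketch below illustrates that behaviour on plain dictionaries standing in for the environment; `resolve_settings_module` and the `config.settings.local` value are hypothetical names used only for illustration.

```python
def resolve_settings_module(environ, default="config.settings.production"):
    """Return the settings module that would be in effect.

    `environ` is any dict-like mapping; os.environ offers the same
    setdefault() semantics. This helper is hypothetical and exists
    only to show which value wins.
    """
    return environ.setdefault("DJANGO_SETTINGS_MODULE", default)


# Variable not set anywhere: the default from celery_app.py applies.
unset_env = {}
print(resolve_settings_module(unset_env))

# Variable already set (assumed here to point at local settings):
# the existing value is kept and the default is ignored.
preset_env = {"DJANGO_SETTINGS_MODULE": "config.settings.local"}
print(resolve_settings_module(preset_env))
```

If the worker still picks up the wrong settings after this change, checking whether the variable is already defined in the worker's environment is a reasonable first step.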