Django with Celery on Digital Ocean


The Objective

I am trying to use Celery with Django. The objective is to set up Celery on a Django web application (deployed to a test environment) to send scheduled emails. The web application already sends emails. The ultimate objective is to add functionality to send emails at a user-selected date-time. However, before getting there, the first step is to invoke the delay() function to prove that Celery is working.
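For the eventual goal of sending at a user-selected date-time, Celery's apply_async() accepts an eta argument. Below is a minimal sketch of preparing that value, assuming the user's choice arrives as a naive datetime plus a known timezone. The helper name to_utc_eta and the fixed UTC-5 offset are illustrative and not part of the original project.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def to_utc_eta(local_dt: datetime, tz: tzinfo) -> datetime:
    """Attach tz to a naive datetime and convert it to an aware UTC datetime."""
    if local_dt.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    return local_dt.replace(tzinfo=tz).astimezone(timezone.utc)


# Hypothetical input: the user picked 09:30 on 1 March 2023 in a UTC-5 zone.
user_tz = timezone(timedelta(hours=-5))
eta = to_utc_eta(datetime(2023, 3, 1, 9, 30), user_tz)
print(eta.isoformat())  # 2023-03-01T14:30:00+00:00

# With a running worker, the task from events/tasks.py could then be scheduled:
# send_email_task.apply_async(eta=eta)
```

In a Django project the timezone would more likely come from zoneinfo.ZoneInfo(settings.TIME_ZONE) or from the user's profile than from a fixed offset.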

Tutorials and Documentation Used

I am new to Celery and have been learning through the following resources:

I have spent several days reviewing existing Stack Overflow questions on Django/Celery and have tried a number of suggestions. However, I have not found a question specifically describing this effect in the Django/Celery/Redis/Digital Ocean context. The current situation is described below.

What Is Currently Happening?

The current outcome, as of this post, is that the web application times out, suggesting that the Django app is not successfully connecting to Celery to send the email. Please note that towards the bottom of the post is the output of the Celery worker being successfully started manually from within the Django app's console, including a listing of the expected tasks.

The Stack In Use

  • Python 3.11 and Django 4.1.6: Running on the Digital Ocean App platform
  • Celery 5.2.7 and Redis 4.4.2 on Ubuntu 20.04: Running on a separate Digital Ocean Droplet

The Django project name is "Whurthy".

Celery Setup Code Snippets

The following snippets are primarily from the Celery-Django documentation: https://docs.celeryq.dev/en/stable/django/first-steps-with-django.html#using-celery-with-django

Whurthy/celery.py

import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'Whurthy.settings')

app = Celery('Whurthy')
app.config_from_object('django.conf:settings', namespace='CELERY')

app.autodiscover_tasks()


@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')

Whurthy/__init__.py

from .celery import app as celery_app

__all__ = ('celery_app',)

Application Specific Code Snippets

Whurthy/settings.py

CELERY_BROKER_URL = 'redis://SNIP_FOR_PRIVACY:6379'
CELERY_RESULT_BACKEND = 'redis://SNIP_FOR_PRIVACY:6379'
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = TIME_ZONE

I have replaced the actual IP with the string SNIP_FOR_PRIVACY for obvious reasons. However, if this were incorrect I would not get the output below.

I have also commented out the bind and requirepass redis configuration settings to support troubleshooting during development. This makes the URL as simple as possible and rules out either the incoming IP or password as being the cause of this problem.

events/tasks.py

from celery import shared_task
from django.core.mail import send_mail


@shared_task
def send_email_task():
    send_mail(
        'Celery Task Worked!',
        'This is proof the task worked!',
        'notifications@domain.com',
        ['my_email@domain.com'],
    )
    return

For privacy reasons I have changed the to and from email addresses. However, please note that this function works before adding .delay() to the following snippet. In other words, the Django app sends an email up until I add .delay() to invoke Celery.

events/views.py (extract)

from .tasks import send_email_task
from django.shortcuts import render

def home(request):
    context = {}  # placeholder; the full view builds its own context
    send_email_task.delay()
    return render(request, 'home.html', context)

The above is just the relevant extract of a larger file to show the specific line of code calling the function. The Django web application is working until delay() is appended to the function call, and so I have not included other Django project file snippets.

Output from Running celery -A Whurthy worker -l info in the Digital Ocean Django App Console

Ultimately, I want to Dockerize this command, but for now I am running the above command manually. Below is the output within the Django App console, and it appears consistent with the tutorial and other examples of what a successfully configured Celery instance would look like.

<SNIP>
 -------------- celery@whurthy-staging-b8bb94b5-xp62x v5.2.7 (dawn-chorus)
--- ***** ----- 
-- ******* ---- Linux-4.4.0-x86_64-with-glibc2.31 2023-02-05 11:51:24
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         Whurthy:0x7f92e54191b0
- ** ---------- .> transport:   redis://SNIP_FOR_PRIVACY:6379//
- ** ---------- .> results:     redis://SNIP_FOR_PRIVACY:6379/
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery
                

[tasks]
  . Whurthy.celery.debug_task
  . events.tasks.send_email_task

This appears to confirm that the Digital Ocean droplet is starting up a Celery worker successfully (suggesting that the code snippets above are correct) and that the Redis configuration is correct. The two tasks listed when starting Celery are consistent with expectations. However, I am clearly missing something, and cannot rule out that the way Digital Ocean runs droplets is getting in the way.

The baseline test is that the web application sends out an email through the function call. However, as soon as I add .delay() the web page request times out.

I have endeavoured to replicate all that is relevant. I welcome any suggestions to resolve this issue or constructive criticism to improve this question.

Troubleshooting Attempts

Attempt 1

  • Through the D.O. app console I ran python manage.py shell
  • I then entered the following into the shell:
>>> from events.tasks import send_email_task
>>> send_email_task
<@task: events.tasks.send_email_task of Whurthy at 0x7fb2f2348dc0>
>>> send_email_task.delay()

At this point the shell hangs/does not respond until I keyboard interrupt.
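A hang like this can sometimes be turned into a visible error by bounding how long Celery waits on Redis. The settings below are a sketch only. The setting names come from Celery's configuration reference (read here through the CELERY_ namespace), but the values are arbitrary assumptions, and nothing in this post confirms that they resolve this particular problem.

```python
# settings.py (sketch): limit how long publishing a task may wait on Redis

# Give up on opening a broker socket after 5 seconds (value is an assumption).
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'socket_connect_timeout': 5,
}

# Retry publishing a task a couple of times, then raise instead of retrying on.
CELERY_TASK_PUBLISH_RETRY_POLICY = {
    'max_retries': 2,
    'interval_start': 0,
    'interval_step': 0.5,
    'interval_max': 1,
}

# Same idea for the Redis result backend connection.
CELERY_REDIS_SOCKET_CONNECT_TIMEOUT = 5
```

With limits like these in place, send_email_task.delay() would be expected to raise an exception naming the connection failure instead of waiting silently.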

I then tried the following:

>>> send_email_task.apply()
<EagerResult: 90b7d92c-4f01-423b-a16f-f7a7c75a545c>

AND, the task sends an email!

So, the connection between Django-Redis-Celery appears to work. However, invoking delay() causes the web app to time out and the email to NOT be sent.

So either delay() isn't putting the task in the queue, or it is getting stuck. In either case, this does not appear to be a connection issue. However, because apply() runs the task in the caller's process and bypasses the broker, it does not resolve the issue.

This does suggest that the issue may lie with the broker, which in turn may be an issue with settings.

I made minor changes to the broker settings in settings.py:

CELERY_BROKER_URL = 'redis://SNIP_FOR_PRIVACY:6379/0'
CELERY_RESULT_BACKEND = 'redis://SNIP_FOR_PRIVACY:6379/1'

delay() still hangs in the shell.
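To separate a network reachability problem from a Celery configuration problem, a plain TCP check against the Redis port can be run from the same console. This is a minimal standard-library sketch. The function name can_reach is hypothetical, and the host to test would be whatever address appears in CELERY_BROKER_URL.

```python
import socket


def can_reach(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


# Example use from `python manage.py shell`, once per candidate address:
# can_reach('SNIP_FOR_PRIVACY')
```

A False result for one address and True for another would point at networking or firewall rules rather than at the Celery setup.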

Attempt 2

I discovered that in Digital Ocean the Droplet's public IPv4 address does not work when used in the broker URL. By replacing it with the private IP in the CELERY_BROKER_URL setting I was able to get delay() working within the Django app's shell.

However, while I can now get delay() working in the shell, the original objective still fails. In other words, the web application hangs when the respective view is loaded.

I am currently researching other approaches. Any suggestions are welcome. Given that I can now get Celery to work through the broker in the shell but not in the web application, I feel like I have made some progress but am still without a solution.

As a side note, I am also trying to make this connection through a Digital Ocean Managed Redis DB, although that is presenting a completely different issue.


Solution

  • Ultimately, the answer I uncovered is a compromise: a workaround using a different Digital Ocean (D.O.) product. The workaround was to use a Managed Database (which simplifies things but gives you much less control) rather than a Droplet (which involves manual Linux/Redis installation and configuration, but gives you greater control). This isn't ideal for two reasons. First, it costs more ($15 base cost rather than $6). Second, I would have preferred to work out how to set up Redis manually (and thus maintain greater control). However, I'll take a working solution over no solution for a very niche issue.

    The steps to use a D.O. Managed Redis DB are:

    • Provision the managed Redis DB
    • Use the Public Network Connection String (as the connection string includes the password I store this in an environment variable)
    • Ensure that you have the appropriate ssl setting in the 'celery.py' file (snippet below)

    celery.py

    import os
    import ssl
    
    from celery import Celery
    
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj_name.settings')
    
    app = Celery(
        'proj_name',
        broker_use_ssl={'ssl_cert_reqs': ssl.CERT_NONE},
        redis_backend_use_ssl={'ssl_cert_reqs': ssl.CERT_NONE}
    )
    
    
    app.config_from_object('django.conf:settings', namespace='CELERY')
    
    app.autodiscover_tasks()
    
    
    @app.task(bind=True)
    def debug_task(self):
        print(f'Request: {self.request!r}')
    

    settings.py

    import os  # needed for os.environ below if settings.py does not already import it
    
    REDIS_URI = os.environ.get('REDIS_URI')
    CELERY_BROKER_URL = f'{REDIS_URI}/0'
    CELERY_RESULT_BACKEND = f'{REDIS_URI}/1'
    CELERY_TASK_TRACK_STARTED = True
    CELERY_TASK_TIME_LIMIT = 30 * 60
    CELERY_ACCEPT_CONTENT = ['json']
    CELERY_TASK_SERIALIZER = 'json'
    CELERY_RESULT_SERIALIZER = 'json'
    CELERY_TIMEZONE = TIME_ZONE
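Appending '/0' and '/1' with an f-string, as in the settings above, assumes that REDIS_URI has no trailing slash and no database path of its own. If the connection string might include either, the path can be set explicitly instead. This is a sketch using only the standard library. The helper name redis_url_for_db and the example host and credentials are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def redis_url_for_db(base_uri: str, db: int) -> str:
    """Return base_uri with its path replaced by the given database number."""
    parts = urlsplit(base_uri)
    return urlunsplit(parts._replace(path=f"/{db}"))


# Made-up connection string; a real one would come from the environment.
base = "rediss://default:secret@example-host:25061/"
print(redis_url_for_db(base, 0))  # rediss://default:secret@example-host:25061/0
print(redis_url_for_db(base, 1))  # rediss://default:secret@example-host:25061/1
```

In settings.py this would read CELERY_BROKER_URL = redis_url_for_db(REDIS_URI, 0) and CELERY_RESULT_BACKEND = redis_url_for_db(REDIS_URI, 1).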