I am using Celery with Django for my scheduled tasks. Locally it worked. When I deployed it to Elastic Beanstalk, the Celery daemon keeps connecting to RabbitMQ and failing, since there is no RabbitMQ installed. Redis is installed locally on EB and it is up and running. The problem is that Celery completely ignores the Redis broker setting no matter what I do. I tried hardcoding it, I tried loading env variables in conf files; nothing seems to work. It would be really helpful if someone could see what I am missing.
Following are my configuration files.
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from django.conf import settings
# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'demo_django.settings')
app = Celery('app_django', broker=settings.CELERY_BROKER_URL)
# Use Django settings for Celery
app.config_from_object('django.conf:settings', namespace='CELERY')
# Auto-discover tasks from installed apps
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
settings.py
# Celery settings
CELERY_BROKER_URL = os.getenv('REDIS_URL')
CELERY_RESULT_BACKEND = os.getenv('REDIS_URL')
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
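One thing worth spelling out: if REDIS_URL is not set in the environment of the process that starts Celery, os.getenv('REDIS_URL') returns None, and Celery silently falls back to its default amqp://guest@localhost:5672// broker, i.e. RabbitMQ, which is exactly the symptom described above. A minimal fail-fast sketch for settings.py (the guard is an addition for illustration, not part of the original config):

import os

# Guard against a missing REDIS_URL: fail loudly instead of letting the
# broker become None, which makes Celery fall back to amqp:// (RabbitMQ).
CELERY_BROKER_URL = os.getenv('REDIS_URL')
if not CELERY_BROKER_URL:
    raise RuntimeError('REDIS_URL is not set; refusing to fall back to amqp://')
CELERY_RESULT_BACKEND = CELERY_BROKER_URL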
.ebextensions/01_celery_config
files:
  "/etc/tmpfiles.d/celery.conf":
    mode: "000777"
    content: |
      d /run/celery 0755 celery celery -
      d /var/log/celery 0755 celery celery -
  "/etc/conf.d/celery":
    mode: "000777"
    content: |
      # Name of nodes to start
      # here we have a single node
      CELERYD_NODES="w1"
      # or we could have three nodes:
      #CELERYD_NODES="w1 w2 w3"
      # Absolute or relative path to the 'celery' command:
      CELERY_BIN="/var/app/venv/staging-LQM1lest/bin/celery"
      #CELERY_BIN="/virtualenvs/def/bin/celery"
      # App instance to use
      # comment out this line if you don't use an app
      CELERY_APP="demo_django"
      # or fully qualified:
      #CELERY_APP="proj.tasks:app"
      # How to call manage.py
      CELERYD_MULTI="multi"
      # Extra command-line arguments to the worker
      CELERYD_OPTS="--time-limit=300 --autoscale=8,3"
      # - %n will be replaced with the first part of the nodename.
      # - %I will be replaced with the current child process index
      #   and is important when using the prefork pool to avoid race conditions.
      CELERYD_PID_FILE="/var/run/celery/%n.pid"
      CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
      CELERYD_LOG_LEVEL="INFO"
      # you may wish to add these options for Celery Beat
      CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
      CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
      # Desperate attempt to make it work; even tried hardcoding
      export DJANGO_SETTINGS_MODULE="demo_django.settings"
      export CELERY_BROKER_URL=redis://127.0.0.1:6379
      export BROKER_URL=redis://127.0.0.1:6379
      export $(/opt/elasticbeanstalk/bin/get-config --output YAML environment | sed -r 's/: /=/' | xargs)
  "/etc/systemd/system/celery.service":
    mode: "000777"
    content: |
      [Unit]
      Description=Celery Service
      After=network.target
      [Service]
      Type=forking
      User=celery
      Group=celery
      EnvironmentFile=/etc/conf.d/celery
      WorkingDirectory=/var/app/current
      ExecStart=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
        --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
        --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES \
        --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
        --loglevel="${CELERYD_LOG_LEVEL}"'
      ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
        --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
        --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      Restart=always
      [Install]
      WantedBy=multi-user.target
No matter what I try, it completely ignores the configuration and connects to RabbitMQ; even hardcoding does not work. I do not understand what I am missing.
So I managed to make it work. The fundamental problem was that Celery was not seeing the environment variables I had defined in the Elastic Beanstalk console, which Django's settings.py relies on. When EB deploys the application, it makes those env variables available to the application itself, but not to any shell or to other services, in this case celery.service.
I set the env variables via the EnvironmentFile of celery.service, so that when the commands run to start/stop/restart Celery they have access to the env variables and connect to the right broker. I modified /etc/conf.d/celery, where I define the command that pulls the env variables dynamically, and /etc/systemd/system/celery.service, where the service executes that command before the celery command (a quick way to verify the result is shown at the end).
Here is how the files look now.
/etc/conf.d/celery
# Name of nodes to start
# here we have a single node
CELERYD_NODES="w1"
# or we could have three nodes:
#CELERYD_NODES="w1 w2 w3"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/var/app/venv/staging-LQM1lest/bin/celery"
#CELERY_BIN="/virtualenvs/def/bin/celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="demo_django"
# or fully qualified:
#CELERY_APP="proj.tasks:app"
# How to call manage.py
CELERYD_MULTI="multi"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --autoscale=8,3"
# - %n will be replaced with the first part of the nodename.
# - %I will be replaced with the current child process index
# and is important when using the prefork pool to avoid race conditions.
CELERYD_PID_FILE="/var/run/celery/%n.pid"
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_LOG_LEVEL="INFO"
# you may wish to add these options for Celery Beat
CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
INCLUDE_ENV="export $(/opt/elasticbeanstalk/bin/get-config --output YAML environment | sed -r 's/: /=/' | xargs)"
/etc/systemd/system/celery.service
[Unit]
Description=Celery Service
After=network.target
[Service]
Type=forking
User=celery
Group=celery
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/var/app/current
ExecStart=/bin/sh -c '${INCLUDE_ENV} && ${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
ExecStop=/bin/sh -c '${INCLUDE_ENV} && ${CELERY_BIN} multi stopwait $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}"'
ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
Restart=always
[Install]
WantedBy=multi-user.target
In the first file, I use the INCLUDE_ENV variable to hold the command that fetches the env variables dynamically.
In the second file, I run ${INCLUDE_ENV} before all three commands so that the processes are aware of the environment.
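For clarity, here is what that pipeline does, sketched in Python with made-up property values: get-config prints the EB environment properties one per line as KEY: value, sed -r 's/: /=/' rewrites the first ': ' on each line to '=', and xargs hands the resulting KEY=value words to export as arguments.

# Illustration only, with hypothetical values; the real lines come from
# /opt/elasticbeanstalk/bin/get-config --output YAML environment
raw = "REDIS_URL: redis://127.0.0.1:6379\nDJANGO_SETTINGS_MODULE: demo_django.settings"

# sed -r 's/: /=/' replaces the first ": " on each line with "="
pairs = [line.replace(": ", "=", 1) for line in raw.splitlines()]

# xargs joins the words into one argument list for `export`
print("export " + " ".join(pairs))
# -> export REDIS_URL=redis://127.0.0.1:6379 DJANGO_SETTINGS_MODULE=demo_django.settings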
NOTE: The command to fetch the env variables is Elastic Beanstalk specific, and this works on the Amazon Linux 2 / 2023 platforms. A different platform might require a different command.
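To confirm the worker resolved the right broker, here is a quick check you can run on the EB instance from /var/app/current with the app's virtualenv active (this assumes the demo_django layout above):

# Print the broker Celery actually resolved; it should be the Redis URL.
# None or an amqp:// URL means the env variables were not picked up.
from demo_django.celery import app
print(app.conf.broker_url)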