Tags: python-3.x, amazon-elastic-beanstalk, celery, django-celery, celeryd

Django-Celery daemon is unable to connect to Redis on Elastic Beanstalk


I am using Celery with Django for my scheduled tasks. Locally everything worked, but after deploying to Elastic Beanstalk the Celery daemon always tries to connect to RabbitMQ and fails, since RabbitMQ is not installed. Redis is installed locally on the EB instance and is up and running. The problem is that Celery completely ignores the Redis broker setting no matter what I do. I tried hardcoding it, and I tried loading environment variables in the conf files, but nothing seems to work. It would be really helpful if someone could see what I am missing.

Following are my configuration files.

celery.py

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from django.conf import settings

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'demo_django.settings')

app = Celery('app_django', broker=settings.CELERY_BROKER_URL)

# Use Django settings for Celery
app.config_from_object('django.conf:settings', namespace='CELERY')

# Auto-discover tasks from installed apps
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
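For `CELERY_APP="demo_django"` (used later in the systemd unit) to resolve, the usual Celery/Django convention is to expose the app in the project package's `__init__.py`; a sketch of that, with the file path assumed from the project name:

```python
# demo_django/__init__.py (path is an assumption based on the project name)
# Load the Celery app when Django starts, so shared_task decorators
# and `-A demo_django` can find it.
from .celery import app as celery_app

__all__ = ('celery_app',)
```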

settings.py

# Celery settings
CELERY_BROKER_URL = os.getenv('REDIS_URL')
CELERY_RESULT_BACKEND = os.getenv('REDIS_URL')
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
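One thing worth guarding against while debugging this: if `REDIS_URL` is not visible to the process, `os.getenv()` returns `None`, and Celery then quietly falls back to its default `amqp://` (RabbitMQ) broker. A minimal sketch that fails fast instead (the `require_env` helper is my own, not part of Django or Celery):

```python
import os

def require_env(name):
    """Return the named environment variable, or raise instead of
    silently returning None (which would make Celery fall back to
    its default amqp:// broker)."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError("Required environment variable %r is not set" % name)
    return value

# Usage in settings.py:
# CELERY_BROKER_URL = require_env('REDIS_URL')
```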

.ebextensions/01_celery_config

files:
  "/etc/tmpfiles.d/celery.conf":
    mode: "000777"
    content: |
      d /run/celery 0755 celery celery -
      d /var/log/celery 0755 celery celery -
  "/etc/conf.d/celery":
    mode: "000777"
    content: |
      # Name of nodes to start
      # here we have a single node
      CELERYD_NODES="w1"
      # or we could have three nodes:
      #CELERYD_NODES="w1 w2 w3"

      # Absolute or relative path to the 'celery' command:
      CELERY_BIN="/var/app/venv/staging-LQM1lest/bin/celery"
      #CELERY_BIN="/virtualenvs/def/bin/celery"

      # App instance to use
      # comment out this line if you don't use an app
      CELERY_APP="demo_django"
      # or fully qualified:
      #CELERY_APP="proj.tasks:app"

      # How to call manage.py
      CELERYD_MULTI="multi"

      # Extra command-line arguments to the worker
      CELERYD_OPTS="--time-limit=300 --autoscale=8,3"

      # - %n will be replaced with the first part of the nodename.
      # - %I will be replaced with the current child process index
      #   and is important when using the prefork pool to avoid race conditions.
      CELERYD_PID_FILE="/var/run/celery/%n.pid"
      CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
      CELERYD_LOG_LEVEL="INFO"

      # you may wish to add these options for Celery Beat
      CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
      CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
      
      # Desperate Attempt to make it work even tried hardcoding
      export DJANGO_SETTINGS_MODULE="demo_django.settings" 
      export CELERY_BROKER_URL=redis://127.0.0.1:6379
      export BROKER_URL=redis://127.0.0.1:6379
      export $(/opt/elasticbeanstalk/bin/get-config --output YAML environment | sed -r 's/: /=/' | xargs)

  "/etc/systemd/system/celery.service":
    mode: "000777"
    content: |
      [Unit]
      Description=Celery Service
      After=network.target

      [Service]
      Type=forking
      User=celery
      Group=celery
      EnvironmentFile=/etc/conf.d/celery
      WorkingDirectory=/var/app/current
      ExecStart=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}"'
      ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      Restart=always

      [Install]
      WantedBy=multi-user.target

No matter what I try, it always connects to RabbitMQ; even hardcoding does not work. I do not understand what I am missing.

  1. I exported the variables in the conf file.
  2. I replaced localhost with 127.0.0.1, since some Stack Overflow answers pointed that out.
  3. I tried changing the env variables to lower-case.

It completely ignores the configuration and tries to connect to RabbitMQ.


Solution

  • So I managed to make it work. The fundamental problem was that Celery was not picking up the environment variables I had defined in the Elastic Beanstalk console, which Django's settings.py relies on. When EB deploys the application it makes those variables available to the application, but not to any shell or to any other service, in this case celery.service. With no broker configured, Celery falls back to its default amqp:// transport, which is why it kept trying to connect to RabbitMQ.

    I set the env variables in the EnvironmentFile of celery.service, so that when the command runs to start/stop/restart Celery it has access to them and connects to the right broker. I modified /etc/conf.d/celery, where I define a command that pulls the env variables dynamically, and /etc/systemd/system/celery.service, where the service runs that command before the Celery command.

    Following is how my files look now.

    /etc/conf.d/celery

      # Name of nodes to start
      # here we have a single node
      CELERYD_NODES="w1"
      # or we could have three nodes:
      #CELERYD_NODES="w1 w2 w3"
    
      # Absolute or relative path to the 'celery' command:
      CELERY_BIN="/var/app/venv/staging-LQM1lest/bin/celery"
      #CELERY_BIN="/virtualenvs/def/bin/celery"
    
      # App instance to use
      # comment out this line if you don't use an app
      CELERY_APP="demo_django"
      # or fully qualified:
      #CELERY_APP="proj.tasks:app"
    
      # How to call manage.py
      CELERYD_MULTI="multi"
    
      # Extra command-line arguments to the worker
      CELERYD_OPTS="--time-limit=300 --autoscale=8,3"
    
      # - %n will be replaced with the first part of the nodename.
      # - %I will be replaced with the current child process index
      #   and is important when using the prefork pool to avoid race conditions.
      CELERYD_PID_FILE="/var/run/celery/%n.pid"
      CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
      CELERYD_LOG_LEVEL="INFO"
    
      # you may wish to add these options for Celery Beat
      CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
      CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
      INCLUDE_ENV="export $(/opt/elasticbeanstalk/bin/get-config --output YAML environment | sed -r 's/: /=/' | xargs)"
    

    /etc/systemd/system/celery.service

      [Unit]
      Description=Celery Service
      After=network.target
    
      [Service]
      Type=forking
      User=celery
      Group=celery
      EnvironmentFile=/etc/conf.d/celery
      WorkingDirectory=/var/app/current
      ExecStart=/bin/sh -c '${INCLUDE_ENV} && ${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      ExecStop=/bin/sh -c '${INCLUDE_ENV} && ${CELERY_BIN} multi stopwait $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}"'
      ExecReload=/bin/sh -c '${INCLUDE_ENV} && ${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
          --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
          --loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
      Restart=always
    
      [Install]
      WantedBy=multi-user.target
    

    In the first file, I used the INCLUDE_ENV variable to hold the command that fetches the env variables dynamically. In the second file, I prepended INCLUDE_ENV to all three Exec commands so that the process is aware of the environment.

    NOTE: The command to fetch the env variables is Elastic Beanstalk specific, and this works on Amazon Linux 2023. A different platform might require a different command.
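As a closing illustration of what the get-config pipeline in /etc/conf.d/celery does: it rewrites get-config's `KEY: value` lines into `KEY=value` pairs that `export` can consume. A rough Python equivalent of the `sed -r 's/: /=/'` step, using made-up sample values (real get-config output depends on your EB environment):

```python
# Illustrative sample in the shape of `get-config --output YAML environment`
# output; these values are made up, not real EB output.
sample = """REDIS_URL: redis://127.0.0.1:6379
DJANGO_SETTINGS_MODULE: demo_django.settings"""

# Equivalent of `sed -r 's/: /=/'`: replace only the first ': ' on each
# line, so values that themselves contain colons (like URLs) survive.
pairs = [line.replace(": ", "=", 1) for line in sample.splitlines()]
print(pairs)
```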