
Celeryd multi with supervisord


I'm trying to run celery multi under supervisord (3.2.2).

supervisord doesn't seem to be able to handle it, although a single celery worker works fine.

This is the output of celery multi, followed by my supervisord configuration:

celery multi v3.1.20 (Cipater)
> Starting nodes...
    > celery1@parzee-dev-app-sfo1: OK
Stale pidfile exists. Removing it.
    > celery2@parzee-dev-app-sfo1: OK
Stale pidfile exists. Removing it.

celeryd.conf

; ==================================
;  celery worker supervisor example
; ==================================

[program:celery]
; Set full path to celery program if using virtualenv
command=/usr/local/src/imbue/application/imbue/supervisorctl/celeryd/celeryd.sh
process_name = %(program_name)s%(process_num)d@%(host_node_name)s
directory=/usr/local/src/imbue/application/imbue/conf/
numprocs=2
stderr_logfile=/usr/local/src/imbue/application/imbue/log/celeryd.err
logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log
stdout_logfile_backups = 10
stderr_logfile_backups = 10
stdout_logfile_maxbytes = 50MB
stderr_logfile_maxbytes = 50MB
autostart=true
autorestart=false
startsecs=10

I'm using the following supervisord variables to emulate the way I start celery:

  • %(program_name)s
  • %(process_num)d
  • @
  • %(host_node_name)s

Supervisorctl

supervisorctl 
celery:celery1@parzee-dev-app-sfo1   FATAL     Exited too quickly (process log may have details)
celery:celery2@parzee-dev-app-sfo1   FATAL     Exited too quickly (process log may have details)

I tried changing this value in /usr/local/lib/python2.7/dist-packages/supervisor/options.py from 0 to 1:

numprocs_start = integer(get(section, 'numprocs_start', 1))

I still get:

celery:celery1@parzee-dev-app-sfo1   FATAL     Exited too quickly (process log may have details)
celery:celery2@parzee-dev-app-sfo1   EXITED    May 14 12:47 AM

Celery is starting but supervisord is not keeping track of it.

root@parzee-dev-app-sfo1:/etc/supervisor# ps -ef | grep celery
root      2728     1  1 00:46 ?        00:00:02 [celeryd: celery1@parzee-dev-app-sfo1:MainProcess] -active- (worker -c 16 -n celery1@parzee-dev-app-sfo1 --loglevel=DEBUG -P processes --logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log --pidfile=/usr/local/src/imbue/application/imbue/log/1.pid)
root      2973     1  1 00:46 ?        00:00:02 [celeryd: celery2@parzee-dev-app-sfo1:MainProcess] -active- (worker -c 16 -n celery2@parzee-dev-app-sfo1 --loglevel=DEBUG -P processes --logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log --pidfile=/usr/local/src/imbue/application/imbue/log/2.pid)

celeryd.sh

source ~/.profile
CELERY_LOGFILE=/usr/local/src/imbue/application/imbue/log/celeryd.log
CELERYD_OPTS=" --loglevel=DEBUG"
CELERY_WORKERS=2
CELERY_PROCESSES=16
cd /usr/local/src/imbue/application/imbue/conf
exec celery multi start $CELERY_WORKERS -P processes -c $CELERY_PROCESSES -n celeryd@{HOSTNAME} -f $CELERY_LOGFILE $CELERYD_OPTS

Similar:

  • Running celeryd_multi with supervisor
  • How to use Supervisor + Django + Celery with multiple Queues and Workers?


Solution

  • Supervisor monitors (starts/stops/restarts) the processes it launches, so each process must run in the foreground (it must not daemonize itself).

    celery multi daemonizes its workers and then exits, so supervisor cannot track it. That is why the workers keep running while supervisor reports "Exited too quickly".

    You can create a separate program for each worker and group them into one.

    [program:worker1]
    command=celery worker -l info -n worker1
    
    [program:worker2]
    command=celery worker -l info -n worker2
    
    [group:workers]
    programs=worker1,worker2
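
    If you would rather keep numprocs=2 and a wrapper script, as in the question, the same idea can be sketched as a script that execs exactly one foreground worker per supervisor process. This is only a sketch under assumptions: the worker_cmd helper and the convention of passing the node name as the first argument are hypothetical, while the worker options (-c, -n, --loglevel) are copied from the question's process listing.

    ```shell
    #!/usr/bin/env bash
    # Sketch: run ONE celery worker in the foreground so supervisord can track it.
    # Assumption: supervisord passes the node name as the first argument, e.g.
    #   command=/path/to/worker.sh celery%(process_num)d
    set -eu

    # Build the worker command line (hypothetical helper; it only prints the command).
    worker_cmd() {
        local node="$1" host="${2:-$(hostname)}"
        echo "celery worker -c 16 -n ${node}@${host} --loglevel=DEBUG"
    }

    # exec replaces the shell with celery, so signals sent by supervisord
    # reach the worker directly and no extra process stays in between.
    if [ "$#" -ge 1 ]; then
        exec $(worker_cmd "$1")
    fi
    ```

    The important part is that the script uses celery worker rather than celery multi, so nothing forks into the background.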
    

    Alternatively, you can write a shell script that keeps running in the foreground for as long as the daemonized process is alive, like this.

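    #!/usr/bin/env bash
    set -eu
    
    # Pidfile of the daemon to watch; adjust this to the pidfile
    # that your daemon actually writes
    pidfile="/var/run/your-daemon.pid"
    
    # Proxy signals from supervisor to the daemonized process
    kill_app() {
        kill "$(cat "$pidfile")"
        exit 0 # exit okay
    }
    trap kill_app SIGINT SIGTERM
    
    # Launch daemon (it forks into the background and returns)
    celery multi start 2 -l INFO
    
    # Give the daemon time to write its pidfile
    sleep 2
    
    # Loop while the pidfile and the process exist
    while [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; do
        sleep 0.5
    done
    exit 1 # exit unexpected (exit codes must be in the range 0-255)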
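
    Note that the script above watches a single pidfile, whereas celery multi start 2 starts two nodes, each with its own pidfile (1.pid and 2.pid in the question's process listing). A minimal sketch of a liveness check that covers several pidfiles could look like this; the all_alive function and the pidfile paths in the usage comment are hypothetical.

    ```shell
    #!/usr/bin/env bash
    # Sketch: succeed only while every given pidfile exists and its process is alive.
    all_alive() {
        local pidfile pid
        for pidfile in "$@"; do
            # A missing pidfile means that node has stopped
            [ -f "$pidfile" ] || return 1
            pid="$(cat "$pidfile")"
            # kill -0 sends no signal; it only checks that the process exists
            kill -0 "$pid" 2>/dev/null || return 1
        done
        return 0
    }

    # Usage inside the wrapper's wait loop (pidfile paths are assumptions):
    #   while all_alive /path/to/1.pid /path/to/2.pid; do sleep 0.5; done
    ```

    The signal handler would need the same treatment, sending kill to the pid in each pidfile rather than just one.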