I have a Docker container that runs Django Celery workers via supervisord; the program setup is pretty simple:
[program:celery_priority]
command=python manage.py celery worker -E -Q priority --concurrency=2 --loglevel=ERROR
directory=/var/lib/app
stdout_events_enabled = true
stderr_events_enabled = true
stopwaitsecs = 600

[program:celery_medium]
command=python manage.py celery worker -E -Q medium --concurrency=2 --loglevel=ERROR
directory=/var/lib/app
stdout_events_enabled = true
stderr_events_enabled = true
stopwaitsecs = 600

[program:celerycam]
command=python manage.py celerycam
directory=/var/lib/app
stdout_events_enabled = true
stderr_events_enabled = true
stopwaitsecs = 600
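Note: the pid file paths that show up in the error below (/tmp/med_celery.pid and /tmp/priority_celery.pid) are not set on these command lines, so they are presumably configured elsewhere in the project; if they were set here, it would be via Celery's --pidfile option, e.g.:

command=python manage.py celery worker -E -Q medium --concurrency=2 --loglevel=ERROR --pidfile=/tmp/med_celery.pid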
Our deployment cycle uses fig to manage the containers; here is what our fig.yml looks like for the worker:
worker:
  build: .docker/worker
  command: normal
  volumes_from:
    - appdata
  hostname: workerprod
  domainname: project.internal
  links:
    - redis
    - rabbit
    - appdata
    - mail
The problem we are facing is that when we try to use fig restart worker, the supervisord programs fail because of a pid file conflict, with the following error:
[130.211.XX.XX] out: worker_1 | celery_medium stderr | [2015-02-13 13:40:54,271: WARNING/MainProcess] ERROR: Pidfile (/tmp/med_celery.pid) already exists.
[130.211.XX.XX] out: worker_1 | Seems we're already running? (pid: 17)
[130.211.XX.XX] out: worker_1 | celery_priority stderr | [2015-02-13 13:40:54,272: WARNING/MainProcess] ERROR: Pidfile (/tmp/priority_celery.pid) already exists.
[130.211.XX.XX] out: worker_1 | Seems we're already running? (pid: 16)
[130.211.XX.XX] out: worker_1 | 2015-02-13 18:40:54,359 INFO exited: celery_medium (exit status 0; expected)
[130.211.XX.XX] out: worker_1 | 2015-02-13 18:40:54,359 INFO exited: celery_priority (exit status 0; expected)
[130.211.XX.XX] out: worker_1 | 2015-02-13 18:40:55,360 INFO success: celerycam entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Yet when we use fig -d up worker it works, because with up fig apparently re-creates the container rather than reusing the existing one. But this causes all linked services to be re-created too, and hence we lose the RabbitMQ data and the Redis cache.
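We believe this is because fig restart maps onto docker restart, which stops and starts the same container, so its writable filesystem (including /tmp) is preserved, whereas up builds a fresh container from the image. A quick way to confirm that assumption with plain Docker (the demo container name is just for illustration):

# A container's writable layer survives a restart; only re-creating the container starts fresh.
docker run -d --name demo busybox sleep 3600
docker exec demo touch /tmp/stale.pid    # simulate a leftover pid file
docker restart demo
docker exec demo ls /tmp                 # stale.pid is still there
docker rm -f demo
docker run -d --name demo busybox sleep 3600
docker exec demo ls /tmp                 # fresh container: /tmp is empty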
Is there a way to restart the container with a simple fig restart worker while making sure the pid files are cleared on restart? Please advise.
Create an ENTRYPOINT script that cleans up any state data before running your CMD. E.g.:
FROM someotherimage
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
And in entrypoint.sh:
#!/bin/sh
# Remove any stale Celery pid files left over from the previous run,
# then hand off to the container's command.
rm -f /tmp/*.pid
exec "$@"
The ENTRYPOINT script will run every time the container starts, and will ensure that any pid files in /tmp are cleared out before running the container command: whatever command the container was started with (the image's CMD, or the command: entry in fig.yml) is passed to the script as arguments, so exec "$@" hands control over to it once the cleanup is done.
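One caveat, assuming standard Docker/fig behaviour: an existing container keeps the ENTRYPOINT it was created with, so after adding the script you need to rebuild the image and re-create the worker container once (depending on your fig version this may also touch linked containers, so pick a convenient moment). From then on a plain restart re-runs the entrypoint and starts from a clean /tmp:

fig build worker       # rebuild the image with the new ENTRYPOINT
fig up -d worker       # one-time: re-create the worker container from the new image
fig restart worker     # subsequent restarts now clear the stale pid files first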