
Using monit to monitor multiple delayed_jobs processes


I've looked at other SO questions about using the monit utility to monitor delayed_jobs processes, and none of them really provide a satisfactory answer to a "real-world" situation.

The issue is that monit expects a single pidfile for each process it monitors. If you run a single delayed_job process, everything works fine and you can start/stop delayed_jobs via monit:

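# .monitrc config file
# monit runs commands without a shell, so the environment variable is set via /usr/bin/env
check process delayed_jobs with pidfile path/to/app/tmp/pids/delayed_jobs.pid
  start program = "/usr/bin/env RAILS_ENV=production path/to/app/script/delayed_job start"
  stop program = "/usr/bin/env RAILS_ENV=production path/to/app/script/delayed_job stop"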

However, most real-world applications run multiple delayed_jobs processes that work on different queues:

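# .monitrc config file
# monit runs commands without a shell, so the environment variable is set via /usr/bin/env
check process delayed_jobs with pidfile path/to/app/tmp/pids/delayed_jobs.pid
  start program = "/usr/bin/env RAILS_ENV=production path/to/app/script/delayed_job --pool=mailers:1 --pool=default:2 --pool=priority:1 start"
  stop program = "/usr/bin/env RAILS_ENV=production path/to/app/script/delayed_job stop"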

This starts 4 separate delayed_jobs processes (1 + 2 + 1), and each process works on the queue named in its pool option.

This does not generate a single path/to/app/tmp/pids/delayed_jobs.pid file, but instead generates multiple pidfiles, one for each separate process, and I don't know how to get monit to handle that.
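
To make the pidfile problem concrete, here is a minimal Ruby sketch of how the --pool arguments above expand into individual workers. It assumes that delayed_job numbers pool workers sequentially from 0 and writes one pidfile per worker named delayed_job.N.pid; that is my understanding of the gem, but it is worth verifying against the installed version. The helper names expand_pools and pidfile_names are made up for illustration.

```ruby
# Expand delayed_job-style pool specs ("queues:count") into one entry per worker.
# A spec without ":count" is treated as a single worker.
# Note: the "*" (all queues) spec is not handled in this sketch.
def expand_pools(specs)
  specs.flat_map do |spec|
    queues, count = spec.split(":", 2)
    Array.new((count || 1).to_i) { queues.split(",") }
  end
end

# Assumed naming scheme: workers are numbered from 0 in the order the pools are listed.
def pidfile_names(specs, pid_dir = "tmp/pids")
  expand_pools(specs).each_index.map { |i| File.join(pid_dir, "delayed_job.#{i}.pid") }
end

if __FILE__ == $PROGRAM_NAME
  pools = ["mailers:1", "default:2", "priority:1"]
  puts pidfile_names(pools)
end
```

Under those assumptions, the three pools yield four pidfile names and none of them is the single delayed_jobs.pid that the monit check is watching.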

For the "stop" action, I could execute pkill -f delayed_jobs, and that works. So the "start" action can spawn multiple processes and the "stop" action can kill them all, but monit then flags the process as "does not exist" because there is no single pidfile for it to monitor.

The brute-force approach is to define a separate monit check for every delayed_jobs queue and, where a single queue has multiple processes, to suffix each one with its process index, so I would have pidfiles like:

delayed_jobs.priority.pid
delayed_jobs.mailers.pid
delayed_jobs.default_1.pid
delayed_jobs.default_2.pid

This "works", but seems extremely verbose and painful to maintain.
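
If one check per worker is what it takes, the stanzas can at least be generated rather than maintained by hand. The Ruby sketch below prints one check process block per worker. The app path, the worker layout, and the default_workers and monit_stanza helpers are all hypothetical. It also assumes each worker is started on its own with delayed_job's --identifier and --queue options, so that each one writes its own delayed_job.N.pid.

```ruby
# Hypothetical layout: worker index => queue, mirroring the four pidfiles above.
def default_workers
  { 0 => "priority", 1 => "mailers", 2 => "default", 3 => "default" }
end

# Build one monit "check process" stanza for a single delayed_job worker.
# app_root is a placeholder; a real monit config needs absolute paths.
def monit_stanza(index, queue, app_root: "/path/to/app", env: "production")
  cmd = "/usr/bin/env RAILS_ENV=#{env} #{app_root}/script/delayed_job " \
        "--identifier=#{index} --queue=#{queue}"
  <<~MONIT
    check process delayed_job_#{index} with pidfile #{app_root}/tmp/pids/delayed_job.#{index}.pid
      start program = "#{cmd} start"
      stop program = "#{cmd} stop"
      group delayed_job
  MONIT
end

if __FILE__ == $PROGRAM_NAME
  puts default_workers.map { |index, queue| monit_stanza(index, queue) }.join("\n")
end
```

The group line puts every worker in one monit group, so the whole set can be started or stopped together (for example with monit -g delayed_job restart) while each worker is still monitored through its own pidfile.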

Is there a cleaner way for monit to monitor a collection of processes, in this case specifically delayed_jobs?


Solution

The short answer is that Monit does not support this, sorry.

"Is there a cleaner way for monit to monitor a collection of processes, in this case specifically delayed_jobs?"

Your "brute force" approach is the only way to start, stop, or otherwise handle a single job in a list of jobs.