supervisord erpnext frappe

Corrupt Supervisor Config for Frappe Multi-Bench Setup


Setup Context:

  • Dedicated CentOS server with the Frappe bench CLI installed as root.
  • Multiple Linux users on the server, each with a domain or sub-domain.
  • Each user has their own bench and sites. NGINX and Redis were configured manually to facilitate this. All sites are fully functional.

Problem:

The scheduler, web and background workers are not working; they only work for the first installed site. In supervisor I get these errors:

frappe-bench-t1-redis:frappe-bench-t1-redis-cache RUNNING pid 12402, uptime 17:12:10
frappe-bench-t1-redis:frappe-bench-t1-redis-queue RUNNING pid 12401, uptime 17:12:10 
frappe-bench-t1-redis:frappe-bench-t1-redis-socketio RUNNING pid 12405, uptime 17:12:10 
frappe-bench-t1-web:frappe-bench-t1-frappe-web RUNNING pid 12429, uptime 17:12:10 
frappe-bench-t1-web:frappe-bench-t1-node-socketio RUNNING pid 12428, uptime 17:12:10 
frappe-bench-t1-workers:frappe-bench-t1-frappe-default-worker-0 FATAL can't find command 'None' 
frappe-bench-t1-workers:frappe-bench-t1-frappe-long-worker-0 FATAL can't find command 'None' 
frappe-bench-t1-workers:frappe-bench-t1-frappe-schedule FATAL can't find command 'None' 
frappe-bench-t1-workers:frappe-bench-t1-frappe-short-worker-0 FATAL can't find command 'None' 
frappe-bench-redis:frappe-bench-redis-cache FATAL Exited too quickly (process log may have details) 
frappe-bench-redis:frappe-bench-redis-queue RUNNING pid 12386, uptime 17:12:10 
frappe-bench-redis:frappe-bench-redis-socketio RUNNING pid 12388, uptime 17:12:10 
frappe-bench-t2-redis:frappe-bench-t2-redis-cache RUNNING pid 12384, uptime 17:12:10 
frappe-bench-t2-redis:frappe-bench-t2-redis-queue RUNNING pid 12383, uptime 17:12:10 
frappe-bench-t2-redis:frappe-bench-t2-redis-socketio RUNNING pid 12385, uptime 17:12:10 
frappe-bench-t2-web:frappe-bench-t2-frappe-web RUNNING pid 12389, uptime 17:12:10 
frappe-bench-t2-web:frappe-bench-t2-node-socketio RUNNING pid 12390, uptime 17:12:10 
frappe-bench-t2-workers:frappe-bench-t2-frappe-default-worker-0 STOPPED Dec 31 07:00 PM 
frappe-bench-t2-workers:frappe-bench-t2-frappe-long-worker-0 STOPPED Dec 31 07:00 PM 
frappe-bench-t2-workers:frappe-bench-t2-frappe-schedule STOPPED Dec 31 07:00 PM 
frappe-bench-t2-workers:frappe-bench-t2-frappe-short-worker-0 STOPPED Dec 31 07:00 PM 
frappe-bench-web:frappe-bench-frappe-web RUNNING pid 12391, uptime 17:12:10 
frappe-bench-web:frappe-bench-node-socketio RUNNING pid 12400, uptime 17:12:10 
[truncated but repetitive lines]
frappe-bench-t3-web:frappe-bench-t3-frappe-web RUNNING pid 12381, uptime 17:12:10 
frappe-bench-t3-web:frappe-bench-t3-node-socketio RUNNING pid 12382, uptime 17:12:10 
frappe-bench-t3-workers:frappe-bench-t3-frappe-default-worker-0 FATAL can't find command 'None' 
frappe-bench-t3-workers:frappe-bench-t3-frappe-long-worker-0 FATAL can't find command 'None' 
frappe-bench-t3-workers:frappe-bench-t3-frappe-schedule FATAL can't find command 'None' 
frappe-bench-t3-workers:frappe-bench-t3-frappe-short-worker-0 FATAL can't find command 'None'

When I open the supervisor.conf for the working site, I see the bench path set correctly, like this:

[program:frappe-bench-frappe-schedule]
command=/usr/local/bin/bench schedule
priority=3
autostart=true
autorestart=true
stdout_logfile=/home/t2/frappe-bench/logs/schedule.log
stderr_logfile=/home/t2/frappe-bench/logs/schedule.error.log
user=t2
directory=/home/t2/frappe-bench

[program:frappe-bench-frappe-default-worker]
command=/usr/local/bin/bench worker --queue default
priority=4
autostart=true

But in the non-working sites the path is set to None. I tried to modify it manually and run bench setup supervisor in the site's bench, but it did not work. Here is a screenshot of the config for the non-working sites:

[Screenshot]


Solution

  • It looks like bench couldn't find its own entry point while generating the supervisor.conf for the "broken" sites' benches.

    To fix this manually, replace "None" in the conf with "/usr/local/bin/bench".

    As for what might have gone wrong: the bench CLI might not be available in the PATH of every user. I'm guessing you set up the config for some users as root, and for others after logging in as the respective user? There could be a number of other causes as well. Running bench setup production ${user} as root from the respective user's bench should work fine.
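The manual fix above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name fix_bench_command is hypothetical, the broken lines are assumed to start with "command=None " followed by the subcommand, and the default path /usr/local/bin/bench is taken from the working config shown in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of bench): rewrite "command=None ..." lines
# in a supervisor config so they point at the bench executable.
# Assumptions: broken lines look like "command=None <subcommand>", and bench
# lives at /usr/local/bin/bench unless a second argument says otherwise.
fix_bench_command() {
    conf_file="$1"
    bench_path="${2:-/usr/local/bin/bench}"

    # Keep a backup next to the original before rewriting it.
    cp "$conf_file" "$conf_file.bak"

    # Replace only the leading "None" token of each command= line;
    # every other line is copied through unchanged.
    sed "s|^command=None |command=$bench_path |" "$conf_file.bak" > "$conf_file"
}
```

It would be called with the path to the affected bench's supervisor.conf, for example fix_bench_command /home/t1/frappe-bench/config/supervisor.conf (that location is an assumption; use wherever the generated file actually lives). Supervisor only picks up the edited file after supervisorctl reread followed by supervisorctl update. To check whether a given user can see bench at all, something like sudo -u t1 -H sh -c 'command -v bench' should print the path if it is on that user's PATH.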