Monitoring redis with god - monitoring conditions


I'm trying to monitor Redis with god, but god keeps trying to restart it even though it's already running. This is my .god script (ported from http://blog.thomasmango.com/post/636319317/resque-in-production):

# Redis
%w{6379}.each do |port|
  God.watch do |w|
    w.name = "redis-server"
    w.interval = 30.seconds
    w.start = "/etc/init.d/redis-server start"
    w.stop = "/etc/init.d/redis-server stop"
    w.restart = "/etc/init.d/redis-server restart"
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds

    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end
  end
end

Now when I start god like this:

god -c /home/phlegx/workspace/projectx/config/god/config.god -D --log-level debug

I get the following output:

I [2011-04-28 18:32:10]  INFO: Loading /home/phlegx/workspace/projectx/config/god/config.god
I [2011-04-28 18:32:10]  INFO: Syslog enabled.
I [2011-04-28 18:32:10]  INFO: Using pid file directory: /var/run/god
I [2011-04-28 18:32:10]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-04-28 18:32:10]  INFO: redis-server move 'unmonitored' to 'up'
D [2011-04-28 18:32:10] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x000000020929d0> in 0 seconds
I [2011-04-28 18:32:10]  INFO: redis-server moved 'unmonitored' to 'up'
I [2011-04-28 18:32:10]  INFO: redis-server [trigger] process is not running (ProcessRunning)
D [2011-04-28 18:32:10] DEBUG: redis-server ProcessRunning [true] {true=>:start}
I [2011-04-28 18:32:10]  INFO: redis-server move 'up' to 'start'
I [2011-04-28 18:32:10]  INFO: redis-server start: /etc/init.d/redis-server start
D [2011-04-28 18:32:20] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x000000020929d0> in 0 seconds
I [2011-04-28 18:32:20]  INFO: redis-server moved 'up' to 'up'
I [2011-04-28 18:32:20]  INFO: redis-server [trigger] process is not running (ProcessRunning)
D [2011-04-28 18:32:20] DEBUG: redis-server ProcessRunning [true] {true=>:start}
I [2011-04-28 18:32:20]  INFO: redis-server move 'up' to 'start'
I [2011-04-28 18:32:20]  INFO: redis-server start: /etc/init.d/redis-server start

D [2011-04-28 18:32:30] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x000000020929d0> in 0 seconds
I [2011-04-28 18:32:30]  INFO: redis-server moved 'up' to 'up'
I [2011-04-28 18:32:30]  INFO: redis-server [trigger] process is not running (ProcessRunning)
D [2011-04-28 18:32:30] DEBUG: redis-server ProcessRunning [true] {true=>:start}
I [2011-04-28 18:32:30]  INFO: redis-server move 'up' to 'start'
I [2011-04-28 18:32:30]  INFO: redis-server start: /etc/init.d/redis-server start

As you can see it always complains with:

INFO: redis-server [trigger] process is not running (ProcessRunning)

Does anyone know what might be causing this?

The PID that god writes to "/var/run/god/redis-server.pid" does not seem to match the one I see when I run "ps":

ps aux | grep redis
redis     7702  0.0  0.0   9876  1376 ?        Ss   18:00   0:01 /usr/bin/redis-server /etc/redis/redis.conf

Shouldn't the PID in "redis-server.pid" be the same as "ps" shows?


Solution

  • I'm not familiar with god (ehehe), but I suspect it expects Redis not to daemonize itself (just as daemontools does). If that is the case, you should turn off self-daemonizing in the Redis config file. IIRC it is the "daemonize yes" directive; change it to "daemonize no". That might work for you.

    Now, if this is the case, the PID in the file can differ because god may be starting Redis and recording the PID of the command it launched. When that redis-server command returns (because Redis has forked into the background), god thinks Redis has died and tries to start it again, at which point it records a new PID. This is readily apparent if the PID that is actually running is smaller than the PID in the file, since the running process was started earlier.
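
One way to check this hypothesis is to test whether the PID god recorded still belongs to a live process. The following is a minimal sketch in plain Ruby, not part of god itself; the helper names read_pid and pid_alive? are made up for illustration, and the default path is the pid file quoted in the question.

```ruby
# Minimal sketch: check whether the PID recorded in a pid file is still alive.
# Assumption: the default path is the pid file mentioned in the question.

# Returns the PID stored in the file as an Integer, or nil if the file
# is missing, unreadable, or does not contain a number.
def read_pid(path)
  Integer(File.read(path).strip)
rescue SystemCallError, ArgumentError
  nil
end

# Returns true if a process with this PID exists. Signal 0 performs the
# existence and permission checks without delivering a signal.
def pid_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

if __FILE__ == $PROGRAM_NAME
  path = ARGV[0] || "/var/run/god/redis-server.pid"
  pid = read_pid(path)
  if pid.nil?
    puts "no usable PID in #{path}"
  elsif pid_alive?(pid)
    puts "PID #{pid} from #{path} is running"
  else
    puts "PID #{pid} from #{path} is stale (no such process)"
  end
end
```

If the script reports the recorded PID as stale while "ps" still shows a redis-server process under a different PID, that would be consistent with god tracking a short-lived launcher rather than the daemonized server.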