I want to use Heroku, but the fact that they restart dynos every 24 hours at random times is making things a bit difficult.
I have a series of jobs dealing with payment processing that are very important, and I want them backed by the database so they're 100% reliable. For this reason I chose DJ (delayed_job), which is slow.
Because I chose DJ, I also can't just push 5,000,000 jobs to the database at once (one per email send).
Because of THAT, I have longer-running jobs (e.g. sending 200,000 text messages over a few hours).
With these longer-running jobs, it's harder to recover correctly when they're cut off right in the middle.
It appears Heroku sends SIGTERM and then expects the process to shut down within 30 seconds. That's not going to happen for my longer jobs.
Now I'm not sure how to handle them. The only approach I can think of is to update the database immediately after each text is sent (for example, an sms_sent_at column), but that destroys database performance compared to sending a single update query per batch.
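There is a middle ground between one UPDATE per row and one per batch: checkpoint every N sends. The sketch below is illustrative, not from the original post; BatchCheckpointer and the flush block are hypothetical names, and in real code the flush block would issue a single bulk UPDATE (e.g. `Message.where(id: ids).update_all(sms_sent_at: Time.now)`).

```ruby
# Sketch: accumulate sent ids and flush them to the database in
# batches, so a crash loses at most one batch of bookkeeping instead
# of requiring one UPDATE per message.
class BatchCheckpointer
  def initialize(batch_size: 1_000, &flush)
    @batch_size = batch_size
    @flush = flush     # e.g. a single bulk UPDATE over the ids
    @pending = []
  end

  # Call after each successful send.
  def record(id)
    @pending << id
    flush! if @pending.size >= @batch_size
  end

  # Write out whatever is still pending (call at the end of the job).
  def flush!
    return if @pending.empty?
    @flush.call(@pending.dup)
    @pending.clear
  end
end
```

With a batch size of 1,000, a 200,000-message job issues 200 UPDATEs instead of 200,000, and after an interruption at most 999 already-sent rows are unmarked.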
This would be a lot better if I could schedule restarts; at least then I could do it at night, when I'm 99% likely not to be running any jobs that take longer than 30 seconds to shut down.
Or, another way: can I 'listen' for SIGTERM within a long-running DJ job and at least abort the loop early so it can resume later?
Here's the proper answer: you listen for SIGTERM (I'm using DJ here) and then rescue gracefully. It's important that the jobs are idempotent.
(See also: "Long running delayed_job jobs stay locked after a restart on Heroku".)
class WithdrawPaymentsJob
  def perform
    term_now = false
    old_term_handler = trap('TERM') do
      term_now = true
      # Chain to the previous handler so DJ's own TERM handling still
      # runs. trap may return a String ('DEFAULT'/'IGNORE') rather than
      # a Proc, so only call it if it's callable.
      old_term_handler.call if old_term_handler.respond_to?(:call)
    end

    loop do
      puts 'doing long running job'
      sleep 1
      # Raising fails the job, so DJ unlocks it and retries it later.
      raise 'Gracefully terminating job early...' if term_now
    end
  ensure
    # Restore the original handler so we don't leak our trap.
    trap('TERM', old_term_handler)
  end
end
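Since raising means the job will run again from the top after the dyno restarts, "idempotent" here means a re-run must never double-send. A minimal sketch of that shape, using hypothetical names (the in-memory message hashes stand in for database rows, and the sent_at write stands in for a batched UPDATE):

```ruby
# Sketch of an idempotent perform: every run, including a resume after
# a SIGTERM abort, selects only rows not yet marked sent, so re-running
# the job never sends the same message twice.
class SendTextsJob
  def initialize(messages, sender)
    @messages = messages   # array of { id:, sent_at: } hashes standing in for rows
    @sender = sender       # callable that actually delivers the SMS
  end

  def perform
    @messages.reject { |m| m[:sent_at] }.each do |m|
      @sender.call(m)
      m[:sent_at] = Time.now   # in real code: a (batched) UPDATE
    end
  end
end
```

Running perform twice delivers each unsent message exactly once; the second run finds nothing left to do.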
Here's how you solve it with Que:
if Que.worker_count.zero?
  raise 'Gracefully terminating job early...'
end
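The pattern is the same as the trap version: do the work in chunks and bail out when a stop condition turns true between chunks. The generalized helper below is a sketch (process_in_chunks is a hypothetical name); with Que you would pass something like `-> { Que.worker_count.zero? }` as the stop check, assuming a Que version where the worker process zeroes its worker count on SIGTERM.

```ruby
# Sketch: process batches one at a time, checking a stop predicate
# after each batch and raising so the job fails fast, unlocks, and is
# retried after the restart.
def process_in_chunks(batches, stop_check)
  processed = []
  batches.each do |batch|
    processed << batch   # stand-in for the real per-batch work
    raise 'Gracefully terminating job early...' if stop_check.call
  end
  processed
end
```

Checking only between batches keeps the check cheap and guarantees you never abandon a half-processed batch.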