Tags: ruby, ruby-on-rails-3, checkpoint

Need a simple, DRY, general-purpose checkpointing mechanism


Context: Many of the operations I'm doing require lengthy web accesses. Sometimes a web access fails and the process needs to be restarted. And it's a pain to restart the process from scratch.

So I've written a number of ad-hoc approaches to checkpointing: when you restart the process, it looks to see if checkpoint data is available and re-initializes state from that, otherwise it creates fresh state. In the course of operation, the process periodically writes checkpoint data somewhere (to a file or to the db). And when it's finished, it cleans up the checkpoint data.

I'd like a simple, DRY, general-purpose checkpointing mechanism. How would you write it? Or is there a module that already does this? (Though it's not an issue yet, extra stars awarded for thread-safe implementations!)


Solution

  • After mulling it over, I decided that I'm willing to make this specific to ActiveRecord. By exploiting Ruby's ensure facility and the destroyed? and changed? methods in ActiveRecord, the design becomes simple:

    define Checkpoint model with :name and :state

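    # file db/migrate/xyzzy_create_checkpoints.rb
    class CreateCheckpoints < ActiveRecord::Migration
      def change
        create_table :checkpoints do |t|
          t.string :name
          # text rather than string: the serialized state can outgrow
          # the usual 255-character limit of a string column
          t.text :state
        end
        # one checkpoint per name
        add_index :checkpoints, :name, :unique => true
      end
    end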
    
    # file app/models/checkpoint.rb
    class Checkpoint < ActiveRecord::Base
      serialize :state
    end
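
For context, `serialize :state` stores the attribute as YAML text by default in Rails 3, so any state that survives a YAML round trip can be checkpointed. A minimal plain-Ruby sketch of that round trip (no Rails involved; the variable names are illustrative only):

```ruby
require 'yaml'

# Roughly what `serialize :state` does on save: dump the Ruby object to YAML text.
state  = [122]
stored = YAML.dump(state)

# ...and on load: parse the YAML text back into a Ruby object.
restored = YAML.load(stored)

puts stored.inspect
puts restored.inspect
```

State that YAML cannot represent faithfully would need a different column type or coder.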
    

    define WithCheckpoint module

    # file lib/with_checkpoint.rb
    module WithCheckpoint
    
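      # Yields the Checkpoint record for +name+, building an unsaved one
      # that holds +initial_state+ if none exists yet.
      def with_checkpoint(name, initial_state)
        scope = Checkpoint.where(:name => name)
        # fetch the existing checkpoint or build a fresh one; building
        # through the scope also sets :name on the new record
        checkpoint = scope.first || scope.new(:state => initial_state)
        begin
          yield(checkpoint)
        ensure
          # on leaving the body (normally or via an exception), save the
          # checkpoint unless it was deleted or has nothing new to persist
          checkpoint.save if !checkpoint.destroyed? && checkpoint.changed?
        end
      end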
    end
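
The control flow can be tried without a database. The sketch below is a stand-in under stated assumptions, not the ActiveRecord version: `MemoryCheckpoint`, its `STORE` hash, and `with_memory_checkpoint` are hypothetical names that mimic only `save`, `delete`, `destroyed?` and `changed?`.

```ruby
# Hypothetical in-memory stand-in for the Checkpoint model (not ActiveRecord).
class MemoryCheckpoint
  STORE = {} # name => saved state; plays the role of the checkpoints table

  attr_reader :state

  # Return the saved checkpoint for name, or a fresh unsaved one.
  def self.fetch(name, initial_state)
    STORE.key?(name) ? new(name, STORE[name], false) : new(name, initial_state, true)
  end

  def initialize(name, state, changed)
    @name = name
    @state = state
    @changed = changed
    @destroyed = false
  end

  # Assigning state marks the checkpoint as needing a save.
  def state=(value)
    @changed = true
    @state = value
  end

  def changed?
    @changed
  end

  def destroyed?
    @destroyed
  end

  def save
    STORE[@name] = @state
    @changed = false
  end

  def delete
    STORE.delete(@name)
    @destroyed = true
  end
end

def with_memory_checkpoint(name, initial_state)
  checkpoint = MemoryCheckpoint.fetch(name, initial_state)
  begin
    yield(checkpoint)
  ensure
    # runs on normal exit and when the block raises
    checkpoint.save if !checkpoint.destroyed? && checkpoint.changed?
  end
end

# First run fails part-way; the ensure clause still records progress.
begin
  with_memory_checkpoint(:demo, 0) do |ckp|
    ckp.state = 3
    raise "simulated failure"
  end
rescue RuntimeError
  # swallow the simulated failure so the script continues
end

# Second run resumes from the saved state and cleans up on success.
resumed_from = nil
with_memory_checkpoint(:demo, 0) do |ckp|
  resumed_from = ckp.state
  ckp.delete
end
puts "resumed from #{resumed_from}"
```

The exception still propagates to the caller; `ensure` only guarantees that progress is recorded before it does.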
    

    sample usage

    Here's a somewhat contrived example that randomly blows up after some number of iterations. A more common case might be a lengthy network or file access that can fail at any point. Note: We store the state in an array only to show that 'state' needn't be a simple integer.

    class TestCheck
      extend WithCheckpoint
    
      def self.do_it
        with_checkpoint(:fred, [0]) do |ckp|
          puts("initial state = #{ckp.state}")
          while ckp.state[0] < 200
            # simulate a failure on roughly 1% of iterations
            raise RuntimeError if rand > 0.99
            # record progress by assigning a new array to state
            ckp.state = [ckp.state[0] + 1]
          end
          puts("completed normally, deleting checkpoint")
          ckp.delete
        end
      end
    
    end
    

    When you run TestCheck.do_it, it may blow up after some number of iterations, but you can restart it until it completes properly:

    >> TestCheck.do_it
    initial state = [0]
    RuntimeError: RuntimeError
            from sketches/checkpoint.rb:40:in `block in do_it'
            from sketches/checkpoint.rb:22:in `with_checkpoint'
            ...
    >> TestCheck.do_it
    initial state = [122]
    completed normally, deleting checkpoint
    => #<Checkpoint id: 3, name: "fred", state: [200]>
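
The question also mentions checkpointing to a file. As a rough sketch of how the same block-based shape might look without ActiveRecord, the following uses only core Ruby plus `tmpdir` for the demo. `FileCheckpoint`, `with_file_checkpoint` and the file path are hypothetical names, `Marshal` is just one possible serializer (it should only load files the process wrote itself), and the mutex guards only `save` and `delete` between threads of a single process, not assignments to `state`.

```ruby
require 'tmpdir'

# Hypothetical file-backed checkpoint; not part of the ActiveRecord design above.
class FileCheckpoint
  attr_accessor :state

  def initialize(path, initial_state)
    @path = path
    @lock = Mutex.new
    @deleted = false
    # resume from the file if a previous run left one behind
    @state = File.exist?(path) ? Marshal.load(File.binread(path)) : initial_state
  end

  def save
    @lock.synchronize do
      return if @deleted
      tmp = "#{@path}.tmp"
      File.binwrite(tmp, Marshal.dump(@state))
      # swap the new file into place (rename is atomic on POSIX filesystems)
      File.rename(tmp, @path)
    end
  end

  def delete
    @lock.synchronize do
      File.delete(@path) if File.exist?(@path)
      @deleted = true
    end
  end
end

def with_file_checkpoint(path, initial_state)
  checkpoint = FileCheckpoint.new(path, initial_state)
  begin
    yield(checkpoint)
  ensure
    checkpoint.save # does nothing once the checkpoint has been deleted
  end
end

path = File.join(Dir.mktmpdir, "fred.ckp")

# First run fails after making some progress.
begin
  with_file_checkpoint(path, [0]) do |ckp|
    ckp.state = [ckp.state[0] + 5]
    raise "simulated failure"
  end
rescue RuntimeError
  # swallow the simulated failure so the script continues
end

# Second run picks up the saved state, then removes the checkpoint.
resumed = nil
with_file_checkpoint(path, [0]) do |ckp|
  resumed = ckp.state
  ckp.delete
end
puts "resumed from #{resumed.inspect}"
```

Unlike the ActiveRecord version, this sketch saves unconditionally on exit because it has no dirty tracking; a `changed?` flag could be added if the extra write matters.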