
Is there a quick and easy way to create a checksum from Ruby's basic data structures?


I have a data structure (Hash) that looks something like this:

{
    foo: "Test string",
    bar: [475934759, 5619827847]
}

I'm trying to create a checksum from that Hash so I can check for equality later. I tried Hash#hash, which returned a satisfyingly nice-looking value, but it turns out that the same Hash produces a different value after the interpreter has been restarted.

I really just want to be able to create a ~128-bit checksum from a Hash, String, or Array instance.

Is this possible?


Solution

  • You could calculate your own hash based on the object's Marshal dump or JSON representation.

    This calculates the MD5 hash of a Marshal dump:

    require 'digest/md5'
    
    hash = {
      foo: "Test string",
      bar: [475934759, 5619827847]
    }
    
    Marshal::dump(hash)
    #=> "\x04\b{\a:\bfooI\"\x10Test string\x06:\x06ET:\bbar[\ai\x04'0^\x1Cl+\b\x87\xC4\xF7N\x01\x00"
    
    Digest::MD5.hexdigest(Marshal::dump(hash))
    #=> "1b6308abdd8f5f6290e2825a078a1a02"
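
The JSON route mentioned above can be sketched the same way. This is a minimal example, not the only way to do it: it assumes the data contains only JSON-friendly values (strings, numbers, arrays, hashes), and the variable names are illustrative. Note that JSON turns symbol keys into strings, so `{foo: 1}` and `{"foo" => 1}` would get the same checksum.

```ruby
require 'json'
require 'digest'

data = {
  foo: "Test string",
  bar: [475934759, 5619827847]
}

# JSON.generate emits keys in insertion order, so two hashes holding the
# same pairs in a different order serialize (and digest) differently.
json = JSON.generate(data)

# MD5 yields a 128-bit digest, i.e. 32 hexadecimal characters.
json_checksum = Digest::MD5.hexdigest(json)
```

Unlike a Marshal dump, the JSON string is readable and can be recomputed from other languages, which may matter if the checksum is compared outside Ruby.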
    

    Update

    You can implement your own dump strategy, for example one that sorts the entries so that key order no longer affects the result, although I would not recommend changing core functionality:

    class Hash
      def _dump(depth)
        # this doesn't cause a recursion because sort returns an array
        Marshal::dump(self.sort, depth)
      end
    
      def self._load(marshaled_hash)
        Hash[Marshal::load(marshaled_hash)]
      end
    end
    
    Marshal::dump({foo:1, bar:2})
    #=> "\x04\bu:\tHash\e\x04\b[\a[\a:\bbari\a[\a:\bfooi\x06"
    
    Marshal::dump({bar:2, foo:1})
    #=> "\x04\bu:\tHash\e\x04\b[\a[\a:\bbari\a[\a:\bfooi\x06"
    
    Marshal::load(Marshal::dump({foo:1, bar:2}))
    #=> {:bar=>2, :foo=>1}
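
If patching Hash feels too invasive, the same order-independence can probably be had with a standalone helper. The sketch below is an assumption-laden illustration rather than an established recipe: `canonicalize` and `stable_checksum` are hypothetical names, and keys are ordered by their `inspect` string so that mixed key types do not raise when compared.

```ruby
require 'digest'

# Hypothetical helper: rebuild nested Hashes with their keys in a fixed
# order so that insertion order no longer affects the serialized bytes.
def canonicalize(obj)
  case obj
  when Hash
    # Order entries by each key's inspect string, then rebuild the Hash;
    # Marshal dumps a Hash in insertion order, so the dump becomes stable.
    obj.sort_by { |key, _| key.inspect }
       .map { |key, value| [canonicalize(key), canonicalize(value)] }
       .to_h
  when Array
    # Arrays keep their order; only their elements are normalized.
    obj.map { |element| canonicalize(element) }
  else
    obj
  end
end

# Hypothetical helper: 128-bit MD5 checksum of the normalized structure.
def stable_checksum(obj)
  Digest::MD5.hexdigest(Marshal.dump(canonicalize(obj)))
end

a = stable_checksum({ foo: 1, bar: [1, { x: 1, y: 2 }] })
b = stable_checksum({ bar: [1, { y: 2, x: 1 }], foo: 1 })
puts a == b
```

One caveat with any Marshal-based checksum: Marshal writes a back-reference when the very same object appears twice in a structure, so two structures that compare equal can still dump to different bytes if one of them shares an object and the other does not.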