
Ruby Marshal.load doesn't keep order of sorted set


I'm saving a SortedSet object in a file using Marshal.dump. The elements in the set are objects as well (that include Comparable and implement the <=> method).

Later on when restoring that object using Marshal.load, the SortedSet that is loaded from the file is not sorted...

Any idea why or how to fix it?

Here is a simplified example that reproduces the problem:

require 'set'
class Foo
  include Comparable

  attr_accessor :num

  def initialize(num)
    @num = num
  end

  def <=>(other)
    num <=> other.num
  end
end

f1 = Foo.new(1)
f2 = Foo.new(2)
f3 = Foo.new(3)

s = SortedSet.new([f2, f1, f3])

File.open('set_test.dump', 'wb') { |f| Marshal.dump(s, f) }

Then, to load the object from the file, I use:

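# Assign the block's return value; a variable first assigned inside the block is not visible outside it.
ls = File.open('set_test.dump', 'rb') { |f| Marshal.load(f) }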

** I'm using Rails 3.2.3 with Ruby 2.1.8

** When loading the dump from the file, do it in a new, separate Rails console (and don't forget to copy-paste the definition of the Foo class :-) )


Solution

  • Reproducing the bug

    I could reproduce this behaviour on every Ruby I tried.

    # write_sorted_set.rb
    require 'set'
    class Foo
      include Comparable
    
      attr_accessor :num
    
      def initialize(num)
        @num = num
      end
    
      def <=>(other)
        num <=> other.num
      end
    end
    
    f1 = Foo.new(1)
    f2 = Foo.new(2)
    f3 = Foo.new(3)
    
    s = SortedSet.new([f2, f1, f3])
    File.open('set_test.dump', 'wb') { |f| Marshal.dump(s, f) }
    p s.to_a
    

    and

    # load_sorted_set.rb
    require 'set'
    class Foo
      include Comparable
    
      attr_accessor :num
    
      def initialize(num)
        @num = num
      end
    
      def <=>(other)
        num <=> other.num
      end
    end
    
    ls = Marshal.load(File.binread('set_test.dump'))
    p ls.to_a
    

    When launching

    ruby write_sorted_set.rb && ruby load_sorted_set.rb
    

    It outputs

    [#<Foo:0x000000010cae30 @num=1>, #<Foo:0x000000010cae08 @num=2>, #<Foo:0x000000010cadb8 @num=3>]
    [#<Foo:0x0000000089be08 @num=2>, #<Foo:0x0000000089bd18 @num=1>, #<Foo:0x0000000089bc78 @num=3>]
    

    Why?

    Comparable isn't used

    Using this definition:

    class Foo
      attr_accessor :num
      def initialize(num)
        @num = num
      end
    end
    

    in load_sorted_set.rb should raise an exception (comparison of Foo with Foo failed (ArgumentError)) as soon as the set tries to sort its elements, but it doesn't. The loaded set never calls <=> at all, so it looks like SortedSet isn't properly initialized by Marshal.load.
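    For reference, this is the exception that any real sorting of such objects would produce. It is a minimal sketch that uses a plain Array#sort instead of a SortedSet; Bar is a hypothetical stand-in for the Foo definition without <=>.

```ruby
# Bar is a hypothetical stand-in for Foo with the <=> method left out.
class Bar
  attr_reader :num

  def initialize(num)
    @num = num
  end
end

sort_error = nil
begin
  # Sorting needs <=>; the default Object#<=> returns nil for two
  # different objects, so the comparison fails.
  [Bar.new(2), Bar.new(1)].sort
rescue ArgumentError => e
  sort_error = e
end

puts sort_error.message
```

    Since loading the dump with the stripped-down Foo raises nothing of the sort, the elements are evidently never compared.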

    lib/set.rb

    Looking at the source code for SortedSet:

      module_eval {
        # a hack to shut up warning
        alias old_init initialize
      }
    

    and

          module_eval {
            # a hack to shut up warning
            remove_method :old_init
          }
    
          @@setup = true
        end
      end
    
      def initialize(*args, &block) # :nodoc:
        SortedSet.setup
        initialize(*args, &block)
      end
    end
    

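    The placeholder initialize calls SortedSet.setup, which replaces initialize (and the other sorting-related methods) with the real implementation, and then calls initialize again. So SortedSet is only fully set up once the first SortedSet has been created with new.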

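    Marshal.load never triggers this: it rebuilds objects without calling initialize, so in a fresh process SortedSet.setup has not run when the set is loaded.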
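    That Marshal.load bypasses initialize can be checked independently of SortedSet. The following is a small sketch; Tracker is a hypothetical class that only counts how often its initialize runs.

```ruby
# Tracker is a hypothetical class that counts how often initialize runs.
class Tracker
  @init_calls = 0

  class << self
    attr_accessor :init_calls
  end

  attr_reader :value

  def initialize(value)
    self.class.init_calls += 1
    @value = value
  end
end

original = Tracker.new(42)
copy = Marshal.load(Marshal.dump(original))

puts Tracker.init_calls  # counts the call made for `original` only
puts copy.value          # the instance variable is restored anyway
```

    The copy has its state back, but any side effect that lives in initialize (here the counter, in SortedSet the call to setup) does not happen for it.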

    Solution

    SortedSet.setup

    You can call

    SortedSet.setup
    

    after require 'set' and before Marshal.load, so that the real SortedSet methods are already defined when the set is loaded.

    SortedSet.new

    You can also force a SortedSet initialization by wrapping the loaded set in a new one:

    ls = SortedSet.new(Marshal.load(File.binread('set_test.dump')))