ruby, process, shared-objects

Shared Variable Among Ruby Processes


I have a Ruby program that loads two very large YAML files, so I could get some speed-up by forking off some processes to take advantage of multiple cores. I've tried looking, but I'm having trouble figuring out how, or even if, I can share variables between processes.

The following code is what I currently have:

@proteins = ""
@decoyProteins = "" 

fork do
  @proteins = YAML.load_file(database)
  exit
end

fork do
  @decoyProteins = YAML.load_file(database)
  exit
end

p @proteins["LVDK"]

p displays nil, though, because of the fork.

So is it possible to have the forked processes share the variables? And if so, how?


Solution

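  • One problem is that you need to use Process.wait to wait for your forked processes to complete. The other is that you can't do interprocess communication through variables: each forked child gets its own copy of the parent's memory, so assignments made in a child are never seen by the parent. To see this: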

    @one = nil
    @two = nil
    @hash = {}
    pidA = fork do
        sleep 1
        @one = 1
        @hash[:one] = 1
        p [:one, @one, :hash, @hash] #=> [ :one, 1, :hash, { :one => 1 } ]
    end
    pidB = fork do
        sleep 2
        @two = 2
        @hash[:two] = 2
        p [:two, @two, :hash, @hash] #=> [ :two, 2, :hash, { :two => 2 } ]
    end
    Process.wait(pidB)
    Process.wait(pidA)
    p [:one, @one, :two, @two, :hash, @hash] #=> [ :one, nil, :two, nil, :hash, {} ]
    

    One way to do interprocess communication is with a pipe (IO::pipe). Open it before you fork, then have each side of the fork close the end of the pipe it doesn't use.

    From ri IO::pipe:

        rd, wr = IO.pipe
    
        if fork
          wr.close
          puts "Parent got: <#{rd.read}>"
          rd.close
          Process.wait
        else
          rd.close
          puts "Sending message to parent"
          wr.write "Hi Dad"
          wr.close
        end
    
     _produces:_
    
        Sending message to parent
        Parent got: <Hi Dad>
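
To get a parsed object (rather than a string) back from a child, one option is to serialize it with Marshal and send it through the pipe. The following is only a sketch of that idea applied to the question's two loads: the helper names `fork_with_result`, `collect_result`, and `sample_yaml` are hypothetical, and the small temporary files stand in for the real databases.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper (not from the original answer): run the block in a
# forked child and return the child's pid plus the read end of a pipe.
def fork_with_result
  rd, wr = IO.pipe
  pid = fork do
    rd.close
    wr.binmode
    wr.write(Marshal.dump(yield))  # serialize the block's return value
    wr.close
    exit!(0)                       # leave the child without running at_exit hooks
  end
  wr.close                         # the parent only reads
  rd.binmode
  [pid, rd]
end

# Read the serialized result, then reap the child.
def collect_result(pid, rd)
  data = rd.read                   # read to EOF first so a large payload cannot block the child
  rd.close
  Process.wait(pid)
  Marshal.load(data)
end

# Stand-in data files so the sketch runs on its own (assumed contents).
def sample_yaml(hash)
  file = Tempfile.new(['db', '.yml'])
  file.write(hash.to_yaml)
  file.close
  file
end

proteins_file = sample_yaml('LVDK' => 1)
decoys_file   = sample_yaml('KDVL' => 2)

# Start both children before collecting either, so the two loads overlap.
jobs = [proteins_file, decoys_file].map do |f|
  fork_with_result { YAML.load_file(f.path) }
end
proteins, decoy_proteins = jobs.map { |pid, rd| collect_result(pid, rd) }

p proteins['LVDK']
p decoy_proteins['KDVL']
```

Note that the parent still pays for reading the pipe and for Marshal.load, so for very large structures it is worth measuring whether this actually beats loading the files one after the other.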
    

    If you want to share variables, use threads:

    @one = nil
    @two = nil
    @hash = {}
    threadA = Thread.fork do
        sleep 1
        @one = 1
        @hash[:one] = 1
        p [:one, @one, :hash, @hash] #=> [ :one, 1, :hash, { :one => 1 } ] # (usually)
    end
    threadB = Thread.fork do
        sleep 2
        @two = 2
        @hash[:two] = 2
        p [:two, @two, :hash, @hash] #=> [ :two, 2, :hash, { :one => 1, :two => 2 } ] # (usually)
    end
    threadA.join
    threadB.join
    p [:one, @one, :two, @two, :hash, @hash] #=> [ :one, 1, :two, 2, :hash, { :one => 1, :two => 2 } ]
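
Applied to the question, the threads do not even need to write to shared instance variables: Thread#value waits for a thread and returns the result of its block. A minimal sketch, assuming a hypothetical helper `load_all_in_threads` and small inline YAML strings in place of the real files:

```ruby
require 'yaml'

# Hypothetical helper: parse each YAML source in its own thread and
# collect the results into a hash keyed by name.
def load_all_in_threads(sources)
  threads = sources.map do |name, text|
    Thread.new { [name, YAML.load(text)] }
  end
  threads.map(&:value).to_h  # Thread#value joins, then returns the block's result
end

# Assumed sample data standing in for the two databases.
sources = {
  proteins: "LVDK: 1\n",
  decoys:   "KDVL: 2\n"
}

loaded = load_all_in_threads(sources)
p loaded[:proteins]['LVDK']
p loaded[:decoys]['KDVL']
```

This only shows how the results get back to the main thread; it says nothing about whether the loads run faster this way.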
    

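    However, I'm not sure threading will get you any gain here. On MRI the global interpreter lock lets only one thread run Ruby code at a time, so threads mainly help while you're waiting on IO, not while you're parsing.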