Tags: ruby, multithreading, stdout, stderr, popen3

How to avoid consecutive instantiation of Ruby Thread object in this code?


I have never used Thread until now, but I think I have to rely on it in this case. I would like to process the stdout and the stderr of a cURL command separately, because I want to replace the carriage returns in the progress indicator (which is written to stderr) with newlines:

require "open3"
cmd="curl -b cookie.txt #{url} -L -o -"
Open3.popen3(cmd) do |stdin, stdout, stderr, wait_thr|

  pid = wait_thr.pid 

  # I have to process stdout and stderr at the same time but
  # asynchronously, because stdout gives much more data than the stderr
  # stream. I instantiate a Thread object for reading the stderr, otherwise
  # "getc" would block the stdout processing loop.

  c = nil
  line = ""
  stdout.each_char do |b|
    STDOUT.print b

    if c == nil then
      c = ""
      thr = Thread.new {
        c = stderr.getc
        if c == "\r" || c == "\n" then
          STDERR.puts line
          line = ""
        else
          line << c
        end
        c = nil
      }
    end
  end

  # if stderr still holds some output then I process it:
  line = ""
  stderr.each_char do |c|
    if c == "\r" || c == "\n" then
      STDERR.puts line
      line = ""
    else
      line << c
    end
  end

  exit_status = wait_thr.value.exitstatus 
  STDERR.puts exit_status

end #popen3

My question is: how can I avoid creating a new Thread instance on every loop cycle while processing stdout (stdout.each_char)? I think it is time-consuming; I would like to instantiate it once and then use its methods such as stop and run.


Solution

  • Generally, you can process one of stdout and stderr in the main thread and create another thread to process the other. This is common practice for processing multiple sources concurrently.

    You need to pay attention to memory sharing in a multithreaded context. In your case, line and stderr are shared and modified by multiple threads without synchronization, which leads to unpredictable behavior.

    In most cases, Ruby handles the line endings for you. I don't quite see the need to handle \r and \n manually here.

    require "open3"
    cmd="curl -b cookie.txt #{url} -L -o -"
    Open3.popen3(cmd) do |stdin, stdout, stderr, wait_thr|
      pid = wait_thr.pid
    
      stdout_thread = Thread.new do
        # process stdout in another thread
        # you can replace this with the logic you want, 
        # if the following behavior isn't what you want
        stdout.each_line do |line|
          puts line
        end
      end
    
      # process stderr in the main thread
      stderr.each_line do |line|
        STDERR.puts line
      end
    
      # wait for the stdout processing to finish
      stdout_thread.join
    end
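
One caveat about the code above: each_line splits on "\n" by default, so a progress indicator that is updated with carriage returns, as the question describes, would not be split into separate lines. A minimal sketch of the same one-thread-per-stream layout that also turns every "\r" into "\n" might look like the following. The helper names pump and relay_with_newlines and the out:/err: parameters are made up for illustration, and reading in 4096-byte chunks with readpartial is my own assumption, not something from the question or the answer.

```ruby
require "open3"

# Hypothetical helper: copies src to dst chunk by chunk, passing each chunk
# through the given block first. Returns once src reaches end of file.
def pump(src, dst)
  loop { dst.write(yield(src.readpartial(4096))) }
rescue EOFError
  nil
end

# Hypothetical helper: runs cmd, copies its stdout to `out` unchanged and its
# stderr to `err` with carriage returns replaced by newlines.
# Returns the exit status of the command.
def relay_with_newlines(cmd, out: $stdout, err: $stderr)
  Open3.popen3(cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.close # the command gets no input from us

    # A single thread, created once, drains stderr for the whole run.
    err_thread = Thread.new do
      pump(stderr, err) { |chunk| chunk.tr("\r", "\n") }
    end

    # The main thread drains stdout unchanged.
    pump(stdout, out) { |chunk| chunk }

    # Wait until stderr is fully drained, then collect the exit status.
    err_thread.join
    wait_thr.value.exitstatus
  end
end
```

Calling relay_with_newlines(cmd) with the cmd string from the question should write the download to STDOUT and the progress lines to STDERR. Each thread only touches its own pair of streams, so nothing is shared between them and no Mutex is needed in this sketch.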