In ruby code I am running a system call with Open3.popen3 and using the resultant IO for stdout and stderr to do some log message formatting before writing to one log file. I was wondering what would be the best way to do this so log messages will maintain the correct order, note I need to do separate formatting for error messages as for stdout messages.
Here's my current code (Assume logger is thread safe)
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
stdout_thr = Thread.new do
while line = stdout.gets.chomp
logger.info(format(:info, line))
end
end
stderr_thr = Thread.new do
while line = stderr.gets.chomp
logger.error(format(:error, line))
end
end
[stdout_thr, stderr_thr].each(&:join)
end
This has worked for me so far, but I'm not so confident that I can guarantee the correct order of the log messages. Is there a better way?
What you're trying to achieve is not possible with a guarantee. First thing to note is that your code could only possibly order based on the time that the data was received, not when it was produced, which is not quite the same. The only way to guarantee this would be to do something on the source which will add some guaranteed ordering between the two systems.
The below code should make it "more likely" to be correct by removing the threads. Assuming that you're using MRI, the threads are "green" so technically can't be running at the same time. That means you're beholden upon the scheduler choosing to run your thread at the "right" time.
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
for_reading = [stdout, stderr]
until(for_reading.empty?) do
wait_timeout = 1
# IO.select blocks until one of the streams is has something to read
# or the wait timeout is reached
readable, _writable, errors = IO.select(for_reading, [], [], wait_timeout)
# readable is nil in the case of a timeout - loop back again
if readable.nil?
Thread.pass
else
# In the case that both streams are readable (and thus have content)
# read from each of them. In this case, we cannot guarantee any order
# because we recieve the items at essentially the same time.
# We can still ensure that we don't mix data incorrectly.
readable.each do |stream|
buffer = ''
# loop through reading data until there is an EOF (value is nil)
# or there is no more data to read (value is empty)
while(true) do
tmp = stream.read_nonblock(4096, buffer, exception: false)
if tmp.nil?
# stream is EOF - nothing more to read on that one..
for_reading -= [stream]
break
elsif tmp.empty? || tmp == :wait_readable
# nothing more to read right now...
# continue on to process the buffer into lines and log them
break
end
end
if stream == stdout
buffer.split("\n").each { |line| logger.info(format(:info, line)) }
elsif stream == stderr
buffer.split("\n").each { |line| logger.info(format(:error, line)) }
end
end
end
end
end
Note that in a system generating a lot of output in a very short period of time there is more likely to be an overlap where things get out of order. This likelihood increases with the amount time taken to read the stream and process it. It would be best to ensure that the absolute minimum processing is done inside the loop. If the formatting (and writing) are expensive, consider moving those items into a separate thread reading from a single queue, and have the code inside the loop only push the buffer (and source identifier) onto the queue.