I am encountering an interesting issue with Ruby's TCPServer: once a client connects, the process uses more and more CPU until it hits 100%, at which point the entire system bogs down and can't process incoming data.
The processing class that is having an issue is designed to be a TCP Client that receives data from an embedded system, processes it, then returns the processed data to be further used (either by other similar data processors, or output to a user).
In this particular case, an external piece of code needs this processed data but cannot get it from the main parent code (the code that the processing class returns its data to). This external piece may or may not be connected at any point while the system is running.
To solve this, I set up a Thread with a TCPServer, and the processing class continually adds to a queue, and the Thread pulls from the queue and sends it to the client.
It works great, except for the performance issues. I am curious if I have something funky going on in my code, or if it's just the nature of this methodology and it will never be performant enough to work.
Thanks in advance for any insight/suggestions with this problem!
Here is my code/setup, with some test helpers:
process_data.rb
require 'socket'

class ProcessData
  def initialize
    super
    @queue = Queue.new
    @client_active = false
    Thread.new do
      # Waiting for connection
      @server = TCPServer.open('localhost', 5000)
      loop do
        Thread.start(@server.accept) do |client|
          puts 'Client connected'
          # Connection established
          @client_active = true
          begin
            # Continually attempt to send data to client
            loop do
              unless @queue.empty?
                # If data exists, send it to client
                begin
                  until @queue.empty?
                    client.puts(@queue.pop)
                  end
                rescue Errno::EPIPE => error
                  # Client disconnected
                  client.close
                end
              end
              sleep(1)
            end
          rescue IOError => error
            # Client disconnected
            @client_active = false
          end
        end # Thread.start(@server.accept)
      end # loop do
    end # Thread.new do
  end

  def read(data)
    # Data comes in from embedded system on this method
    # Do some processing
    processed_data = data.to_i + 5678
    # Ready to send data to external client
    if @client_active
      @queue << processed_data
    end
    return processed_data
  end
end
test_embedded_system.rb (source of the original data)
require 'socket'

@data = '1234'*100000 # Simulate lots of data coming in
embedded_system = TCPServer.open('localhost', 5555)
client_connection = embedded_system.accept
loop do
  client_connection.puts(@data)
  sleep(0.1)
end
parent.rb (this is what will create/call the ProcessData class)
require_relative 'process_data'

processor = ProcessData.new
loop do
  begin
    s = TCPSocket.new('localhost', 5555)
    while data = s.gets
      processor.read(data)
    end
  rescue => e
    sleep(1)
  end
end
random_client.rb (wants data from ProcessData)
require 'socket'

loop do
  begin
    s = TCPSocket.new('localhost', 5000)
    while processed_data = s.gets
      puts processed_data
    end
  rescue => e
    sleep(1)
  end
end
To run the test on Linux, open 3 terminal windows:
Window 1: ./test_embedded_system.rb
Window 2: ./parent.rb
  (CPU usage is stable)
Window 3: ./random_client.rb
  (CPU usage continually grows)
I ended up figuring out what the issue was, and unfortunately I led folks astray with my example.
It turns out my example didn't quite have the issue I was having: the main difference was that the sleep(1) was not in my version of process_data.rb.
That sleep is actually incredibly important, because it is inside of a loop do. Without the sleep, the Thread busy-waits: it spins through the loop re-checking the queue, never voluntarily yields the GVL, and continually eats up CPU resources.
Essentially, it was unrelated to TCP stuff, and more related to Threads and loops.
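To make the difference concrete, here is a minimal sketch (not the real process_data.rb) that compares the CPU time consumed by a polling thread with and without a sleep in its loop. The helper names cpu_seconds and poll are made up for illustration, and it assumes a platform where Process::CLOCK_PROCESS_CPUTIME_ID is available, such as Linux.

```ruby
# Hypothetical helper: CPU seconds this process consumes while the block runs.
def cpu_seconds
  start = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - start
end

# Hypothetical helper: polls a queue for `duration` seconds of wall-clock time,
# sleeping `pause` seconds between checks (no sleep at all when pause is nil).
def poll(queue, duration, pause)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + duration
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    queue.pop(true) unless queue.empty?
    sleep(pause) if pause
  end
end

queue = Queue.new # stays empty, like a client thread with nothing to send

busy   = cpu_seconds { Thread.new { poll(queue, 0.5, nil) }.join }
polite = cpu_seconds { Thread.new { poll(queue, 0.5, 0.05) }.join }

puts format('busy loop: %.3fs CPU, sleeping loop: %.3fs CPU', busy, polite)
```

Both threads run for the same half second of wall-clock time, but the one without a sleep should report far more CPU time, since it never stops checking the empty queue.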
If you stumble on this question later on, you can put a sleep(0) in your loops if you don't want the loop to wait but you do want it to yield the GVL.
Check out these answers as well for more info: Ruby infinite loop causes 100% cpu load
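Another option, which is not what I used above, is to avoid polling altogether: Queue#pop without arguments blocks the calling thread until an item is available, so the sender thread sleeps on its own while the queue is empty. Below is a minimal sketch of that idea; the received array is only a stand-in for the client socket, and the sample values are made up.

```ruby
queue = Queue.new
received = [] # stand-in for client.puts in the real sender thread

consumer = Thread.new do
  # Queue#pop blocks until an item arrives, so no sleep or empty? check is needed.
  # Once the queue is closed and drained, pop returns nil and the loop ends.
  while (item = queue.pop)
    received << item
  end
end

# Producer side: push a few processed values, then signal that no more will come.
3.times { |i| queue << i + 5678 }
queue.close

consumer.join
p received
```

Queue#close requires Ruby 2.3 or later. In a long-running server you would likely leave the queue open and let the thread block in pop until the next value arrives.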