ruby sinatra rack thin eventmachine

High memory consumption downloading large files on Sinatra and Thin


I'm running a Sinatra app on Thin.

Here's a simplified version of the code:

class StreamApp < Sinatra::Base
  get "/" do
    s3_object = # large S3 object (not loaded into memory)
    stream do |out|
      s3_object.read do |chunk|
        out << chunk
      end
    end
  end
end

As the streaming goes on, memory usage on the box keeps climbing until it hits the limit and the process dies.

I have read articles from back in 2009 saying that this was an issue with EventMachine and Rack buffering the data until the entire response was complete.

Has anyone seen this issue or found a workaround for this?


Solution

  • The way streaming in Sinatra works under EventMachine is that each call to out << chunk schedules a call in EventMachine to send that chunk. The problem with your code is that the read loop blocks EventMachine's event loop until the entire file has been read, so none of the scheduled sends can run in the meantime. Nothing is sent until all of the data is in memory.

    This can be worked around by doing something like:

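    get "/" do
      s3_object = # large S3 object (not loaded into memory)
      stream :keep_open do |out|
        reader = lambda do
          # Assumes s3_object.read returns one chunk per call and nil at end of data.
          chunk = s3_object.read
          if chunk.nil?
            out.close  # a :keep_open stream must be closed explicitly
            next       # leave the lambda without rescheduling
          end
          out << chunk
          # Yield to the event loop so this chunk can be sent before the next read.
          EM.next_tick(&reader)
        end
        reader.call
      end
    end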
    

    This reads one chunk each time EventMachine is ready, without blocking the event loop for the whole file. Of course, in this case s3_object.read needs to return only one chunk at a time.
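
To see why rescheduling the reader helps, here is a minimal sketch that swaps EventMachine for a toy in-memory loop. TinyLoop, FakeStream, blocking_stream and cooperative_stream are hypothetical names made up for this illustration, not part of Sinatra or EventMachine. The sketch only assumes what is described above: out << chunk queues a send that runs on a later tick of the loop.

```ruby
# Toy stand-in for the EventMachine reactor: a FIFO queue of callbacks.
# (Hypothetical class, for illustration only.)
class TinyLoop
  def initialize
    @queue = []
  end

  # Schedule a block for a later loop iteration, similar in spirit to EM.next_tick.
  def next_tick(&blk)
    @queue << blk
  end

  # Run callbacks in order until none are left.
  def run
    @queue.shift.call until @queue.empty?
  end
end

# Toy stand-in for the stream object: << only schedules the send.
class FakeStream
  attr_reader :sent, :peak_buffered

  def initialize(reactor)
    @reactor = reactor
    @sent = []
    @buffered = 0
    @peak_buffered = 0
  end

  def <<(chunk)
    @buffered += 1
    @peak_buffered = [@peak_buffered, @buffered].max
    @reactor.next_tick do
      @sent << chunk      # the chunk leaves memory only when this callback runs
      @buffered -= 1
    end
    self
  end
end

# Pattern from the question: one callback pushes every chunk before yielding.
def blocking_stream(chunks)
  reactor = TinyLoop.new
  out = FakeStream.new(reactor)
  reactor.next_tick { chunks.each { |chunk| out << chunk } }
  reactor.run
  out
end

# Pattern from the answer: push one chunk, then reschedule the reader.
def cooperative_stream(chunks)
  reactor = TinyLoop.new
  out = FakeStream.new(reactor)
  remaining = chunks.dup
  reader = lambda do
    chunk = remaining.shift
    next if chunk.nil?            # no more data: stop rescheduling
    out << chunk
    reactor.next_tick(&reader)    # queued behind the send scheduled just above
  end
  reactor.next_tick(&reader)
  reactor.run
  out
end
```

With four chunks, blocking_stream(%w[a b c d]).peak_buffered is 4, while cooperative_stream(%w[a b c d]).peak_buffered is 1. Both deliver the chunks in the same order. In the real app the buffered chunks are pieces of the S3 object, so the first pattern ends up holding the whole file in memory.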