Tags: ruby-on-rails, amazon-s3, jruby, jrubyonrails, torquebox

Foreground or background image manipulation in Rails (JRuby, TorqueBox)


I upload photos via AJAX, and the image manipulation and the upload to S3 take a lot of time. I have heard that it is better to run such tasks in the background. My app needs to wait until the photos are uploaded. But if I go the background route, I will need to use WebSockets or repeated AJAX requests to check for the result (the links to S3), and I'm not happy about that. Why is it so bad to do heavy work right in the controller (foreground)? I currently use TorqueBox (JRuby), and as I understand it, it has excellent concurrency. Does that mean waiting on the upload to S3 will not tie up resources and everything will work fine? Please describe the pros and cons of background versus foreground processing in my situation. Thank you!


Solution

  • It is generally considered bad practice to block a web request handler on a network request to a third-party service. If that service becomes slow or unavailable, this can clog up all your web processes, regardless of which Ruby implementation you are using. This is what you are referring to as 'foreground.'

    Essentially this is the flow of your current setup (in foreground):

    1. A user uploads an image on your site and the relevant controller receives the request.
    2. Your controller makes a synchronous request to S3. This is a blocking request.
    3. Your controller waits.
    4. Your controller waits.
    5. Your controller continues to wait.
    6. Finally (and this is not guaranteed), you receive a response from S3, your code continues, and it renders your view/JSON/text/etc.

    Clearly steps 3-5 are very bad news for your server. As I stated earlier, this worker/thread/process (depending on your Ruby/Rails server) will be 'held up' until the response from S3 is received, which potentially could never happen.
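
To make the cost of steps 3-5 concrete, here is a rough simulation in plain Ruby. It is only a sketch under stated assumptions: `blocking_upload` and `serve_requests` are hypothetical names, `sleep` stands in for the synchronous S3 call, and a fixed pool of threads stands in for your server's request workers.

```ruby
# Hypothetical stand-in for a synchronous S3 upload inside a controller action.
def blocking_upload(delay)
  sleep delay
  :uploaded
end

# Simulates a server with `pool_size` request workers. Every request blocks
# its worker for `delay` seconds. Returns the total elapsed time in seconds.
def serve_requests(request_count:, pool_size:, delay:)
  queue = Queue.new
  request_count.times { |i| queue << i }

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  workers = Array.new(pool_size) do
    Thread.new do
      loop do
        begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        blocking_upload(delay) # the worker can do nothing else meanwhile
      end
    end
  end
  workers.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end
```

With `serve_requests(request_count: 4, pool_size: 2, delay: 0.1)`, the later requests have to queue behind the earlier ones because both workers are busy waiting. More threads push the limit further out, but each waiting request still occupies one of them.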

    Here is the same flow with a background job, plus some JavaScript help on the front end for notification:

    1. A user uploads an image on your site and the relevant controller receives the request.
    2. Your controller creates a new thread/process to make the request to S3. This is a non-blocking approach. You set a flag on the record that references your S3 image src, for example completed: false, and your code continues straight to step 3. The new thread/process is now the one waiting for a response from S3, and it sets the 'completed' flag to true when S3 responds.
    3. You render your view/JSON/text/etc., which releases the worker/thread/process handling this request. Good news!
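
The background flow above can be sketched in plain Ruby. This is a minimal illustration, not a production setup: `PHOTOS`, `slow_upload`, `start_upload`, and `upload_status` are hypothetical names, the hash stands in for a database record, and a bare `Thread` stands in for a real job queue.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a table of photo records.
PHOTOS = {}
PHOTOS_LOCK = Mutex.new

# Hypothetical placeholder for the slow image manipulation and S3 upload.
# A real app would call an S3 client here; this only sleeps and returns
# a made-up URL.
def slow_upload(photo_id)
  sleep 0.2
  "https://example.invalid/photos/#{photo_id}.jpg"
end

# Step 2: create the record with completed: false, hand the slow work to a
# background thread, and return immediately.
def start_upload
  id = SecureRandom.hex(4)
  PHOTOS_LOCK.synchronize { PHOTOS[id] = { completed: false, url: nil } }

  worker = Thread.new do
    url = slow_upload(id)
    # Flip the flag only once the upload has finished.
    PHOTOS_LOCK.synchronize { PHOTOS[id] = { completed: true, url: url } }
  end

  [id, worker]
end

# What a status-check action would look up for the client.
def upload_status(id)
  PHOTOS_LOCK.synchronize { PHOTOS.fetch(id).dup }
end
```

In a controller, the upload action would call `start_upload` and render the id right away (step 3), while a separate status action would return `upload_status(id)`. The returned thread is only there so a caller can `join` it in a script or test.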

    Now for the fun front-end stuff:

    1. Your client receives your response, which triggers your front-end JavaScript to start a setInterval-style repeating function that 'pings' your server every 3 seconds or so. On each ping, your back-end controller checks whether the 'completed' flag you set earlier is true and, if so, responds with true.
    2. Your client-side JavaScript receives each response and either keeps pinging (until you decide it should give up) or stops pinging because your app responded true.
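
The ping-until-done-or-give-up logic is the same in any language. Here it is sketched in Ruby to stay consistent with the Rails side; the front-end JavaScript would mirror it with setInterval. The name `poll_until_done` and its parameters are hypothetical, and the block stands in for the request to your status endpoint.

```ruby
# Calls the given block every `interval` seconds until it returns a truthy
# value or `max_attempts` checks have been made. Returns the truthy result,
# or nil if the caller should give up.
def poll_until_done(interval: 3, max_attempts: 20)
  max_attempts.times do |attempt|
    result = yield
    return result if result

    # No need to wait after the final failed check.
    sleep interval unless attempt == max_attempts - 1
  end
  nil
end
```

The `max_attempts` cap is what "until you designate that it should give up" looks like in practice. Without it, a failed upload would leave the client pinging forever.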

    I hope this sets you on the right path. I figured writing code would be less useful for this answer because it seemed like you were looking for pros and cons. For actual implementation ideas, I would look into the following:

    • Sidekiq is excellent for solving the background job issues described here. It runs the job in a separate worker process, where you can make the request to S3.
    • Here is an excellent Railscast that will help you get a better understanding of the code.