ruby-on-rails ruby heroku concurrency resque

How to make multiple parallel concurrent requests with Rails and Heroku


I am currently developing a Rails application that takes a long list of links as input, scrapes them using a background worker (Resque), and then serves the results to the user. In some cases there are many URLs, and I would like to make multiple requests in parallel so that the whole job takes much less time, rather than requesting one page, waiting for it to complete, scraping it, and only then moving on to the next one.

Is there a way to do this with Heroku and Rails? Where might I find more information?

I've come across resque-pool, but I'm not sure whether it would solve this issue or how to implement it. I've also read about using different types of servers to run Rails in order to make concurrency possible, but I don't know how to adapt my current setup to take advantage of this.

Any help would be greatly appreciated.


Solution

  • Don't use Resque. Use Sidekiq instead.

    A Resque worker is a single-threaded process, so each worker process handles one job at a time. Sidekiq runs many worker threads inside one process, so several jobs can run simultaneously in different threads.

    Make sure each job scrapes a single URL. There is nothing to gain if one job scrapes multiple URLs in sequence.

    With Sidekiq, you can pass the link to a worker, e.g.

    LINKS = [...]
    LINKS.each do |link|
      ScrapeWorker.perform_async(link)
    end
    

    perform_async doesn't execute the job right away. Instead, the link is put in a queue in Redis along with the worker class name, and later (possibly only milliseconds later) a Sidekiq thread picks up each queued job and runs the perform instance method of ScrapeWorker in its own thread. If an exception occurs while a job is running, Sidekiq will retry it.

    PS: You don't have to pass the link itself to the worker. You can store the links in a table and pass the record ids to the workers instead.

    More info about Sidekiq
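
To make the enqueue-then-execute flow described above more concrete, here is a rough, standard-library-only model of it. This is not Sidekiq's implementation: the in-memory `JOBS` queue stands in for Redis, `run_jobs` and its `concurrency` / `max_retries` parameters are hypothetical names invented for this sketch, and `perform` only sleeps to simulate a slow HTTP request. Real Sidekiq retries are also more elaborate than the immediate retry shown here.

```ruby
# Hypothetical stand-in for a Sidekiq worker class: one job = one URL.
class ScrapeWorker
  JOBS = Queue.new # in-memory stand-in for the Redis-backed queue

  # Like perform_async in the answer, this only enqueues; nothing runs yet.
  def self.perform_async(link)
    JOBS << [self, link]
  end

  # The actual work. A real worker would fetch and parse the page here.
  def perform(link)
    sleep 0.1 # simulate network latency
    "scraped #{link}"
  end
end

# Drain the queue with `concurrency` threads. Each job runs on a fresh
# worker instance; a failing job is retried up to `max_retries` times
# and then dropped (a simplification made for this sketch).
def run_jobs(concurrency: 5, max_retries: 2)
  results = Queue.new
  threads = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          klass, arg = ScrapeWorker::JOBS.pop(true) # non-blocking pop
        rescue ThreadError
          break # queue is empty, so this thread is done
        end

        attempts = 0
        begin
          results << klass.new.perform(arg)
        rescue StandardError
          attempts += 1
          retry if attempts <= max_retries
        end
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

links = %w[http://example.com/a http://example.com/b http://example.com/c]
links.each { |link| ScrapeWorker.perform_async(link) } # returns immediately
puts run_jobs(concurrency: 3).sort
```

Because each simulated request spends its time waiting, three threads finish the three jobs in roughly the time of one, which is the same effect the answer is after when it suggests one URL per job.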