ruby heroku screen-scraping

Heroku Friendly Ruby Scraper (with AJAX)?


So I'm trying to write a small unofficial API in Ruby to pull data from a site. I like Mechanize, but all the data I need from the page is generated by AJAX so Mechanize doesn't see it at all. What can I use to render a page with JavaScript so that I can scrape the data? I think something like spynner but for Ruby would do the trick.

I would also like to play with Heroku, so I'm looking for something that could be deployed there, which leads me away from something like Watir.

Does anything like this exist?

Update

For clarity, I'm trying to pull workout data from a Fitocracy profile page.

You may need an account before you can view the page, but basically all the workout data is displayed via JavaScript inside a page shell.


Solution

  • An AJAX request is the same as a non-AJAX request; it's just not always obvious how to make it. Mechanize can make any request that a browser can make: find the URL the page's JavaScript calls (the browser's network inspector will show it) and request that URL directly. Sure, Watir is easier, but if this is for an API you should do it the right way and use Mechanize.
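
To make the point above concrete, here is a minimal sketch of calling an AJAX endpoint directly with Mechanize instead of rendering the page. The endpoint URL, the referer, and the assumption that the endpoint returns JSON are all hypothetical; the real values have to be read off the browser's network inspector while the profile page loads. The `X-Requested-With` header is what many JavaScript libraries send on XHR calls, and whether a given site checks it is also an assumption.

```ruby
require 'json'

# Headers commonly sent by browser XHR calls. Whether a particular
# endpoint requires them is an assumption; they are harmless otherwise.
XHR_HEADERS = {
  'X-Requested-With' => 'XMLHttpRequest',
  'Accept'           => 'application/json'
}.freeze

# Requests `url` the way the page's JavaScript would and parses the
# body as JSON. `agent` is a Mechanize instance (or anything with the
# same get(uri, parameters, referer, headers) interface).
def fetch_json(agent, url, referer: nil)
  response = agent.get(url, [], referer, XHR_HEADERS)
  JSON.parse(response.body)
end

# Example wiring with a real Mechanize agent. Both URLs are
# placeholders, not real endpoints.
def scrape_workouts(endpoint_url, profile_url)
  require 'mechanize' # gem install mechanize
  agent = Mechanize.new
  agent.get(profile_url) # pick up any session cookies first
  fetch_json(agent, endpoint_url, referer: profile_url)
end
```

Calling `scrape_workouts('https://example.com/workouts.json', 'https://example.com/profile')` would return the parsed JSON as Ruby hashes and arrays. If the site requires a login, submit the login form with the same agent first so the session cookie is sent along with the AJAX request. Since this uses only Mechanize and no browser, it should deploy to Heroku like any other Ruby app.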