
Rails + MediaWiki API for Wikipedia data extraction


I am trying to use Rails to extract data from Wikipedia, based on a search term.

For example,

1) If I have the string "American Idol", I want to pass it to Wikipedia and get back a list of related articles. My goal is to take the first 3 links and display them on my website.

2) One step further would be extracting small pieces of data from Wikipedia, such as the infobox or the first few words of the article.

Any tips?

Thanks!


Solution

  • You don't need to resort to screen-scraping: MediaWiki has a very comprehensive API for precisely this kind of thing. See https://github.com/jpatokal/mediawiki-gateway for a handy Ruby wrapper around it.

    Alternatively, if you're only interested in data like infoboxes, see DBpedia for the database version of Wikipedia.
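
To make that concrete, here is a rough sketch that calls the MediaWiki API on en.wikipedia.org directly with Ruby's standard library, rather than going through the gem mentioned above. It assumes the `opensearch` action for the "first 3 links" part and the `extracts` property (available on Wikipedia via the TextExtracts extension) for the intro text. The helper names, the `WIKIPEDIA_API` constant, and the User-Agent string are made up for illustration, and the sketch does not try to pull out infoboxes.

```ruby
require "net/http"
require "json"
require "uri"

# Assumption: English Wikipedia; swap the host for another language edition.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"

# Build the URL for an opensearch query (hypothetical helper name).
def opensearch_uri(term, limit: 3)
  uri = URI(WIKIPEDIA_API)
  uri.query = URI.encode_www_form(
    action: "opensearch", search: term, limit: limit, format: "json"
  )
  uri
end

# An opensearch response is a JSON array: [term, titles, descriptions, urls].
# Pair each title with its URL so a view can render the links.
def parse_opensearch(body)
  _term, titles, _descriptions, urls = JSON.parse(body)
  titles.zip(urls).map { |title, url| { title: title, url: url } }
end

# Build the URL for the plain-text intro section of one article.
def extract_uri(title)
  uri = URI(WIKIPEDIA_API)
  uri.query = URI.encode_www_form(
    action: "query", prop: "extracts", exintro: 1, explaintext: 1,
    titles: title, format: "json"
  )
  uri
end

# The query response nests pages under query.pages, keyed by page id.
# Returns the extract of the first page, or nil if there is none.
def parse_extract(body)
  pages = JSON.parse(body).dig("query", "pages") || {}
  page = pages.values.first
  page && page["extract"]
end

# Perform the HTTP GET and return the response body as a string.
# The User-Agent value is a placeholder; use one that identifies your app.
def fetch(uri)
  request = Net::HTTP::Get.new(uri)
  request["User-Agent"] = "ExampleRailsApp/0.1 (contact@example.com)"
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  response.body
end

# Example usage (makes network requests, so it is left commented out):
#   links = parse_opensearch(fetch(opensearch_uri("American Idol")))
#   intro = parse_extract(fetch(extract_uri(links.first[:title])))
```

In a Rails app you would probably call these from a service object or model method and cache the results, since every call is a round trip to Wikipedia. Keeping the URL building and parsing separate from `fetch` also makes the pieces easy to test without hitting the network.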