
Using Wikipedia-Client Gem to Update Rails Database


My Ruby and Rails are a bit rusty. I have a table in my database called institutes with some of its columns filled in, and I want to use the Wikipedia-Client gem to fill in some of the others. I want to use the name attribute to find the page on Wikipedia, then use page.summary for the description attribute in my table and page.image_urls.first for the picture attribute. At the moment, I'm struggling to work out how to go about this.

My current code is:

require 'Wikipedia'
Institute.each do |institute|
   school = institute.pluck(:name)
   page = Wikipedia.find(school)
   description = page.summary
   picture = page.image_urls.first
   Institute.update!(description: description, picture: picture)
end

I'm clearly doing something wrong here to do with the selection and use of the name attribute to find the Wikipedia page, but can't quite work it out. I think even if I were to pluck the name correctly, it wouldn't assign anything to the right id.

If there's a way to drop the "The" at the beginning of the name in the Wikipedia search when it exists in :name, that would also be helpful, as it seems some institutes drop this on Wikipedia.


Solution

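  • pluck is called on a model class or relation, not on a single record, and update! needs to be called on the record you are iterating over rather than on Institute. You can try something like this: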

    # Uses https://github.com/kenpratt/wikipedia-client
    require 'wikipedia'

    # Iterate over every Institute record through the ActiveRecord model
    Institute.all.each do |institute|
      # `institute` is a model instance, so read the attribute directly
      school = institute.name

      # Look up the name as stored; if no page exists and the name starts
      # with "The ", retry without that prefix
      page = Wikipedia.find(school)
      page = Wikipedia.find(school.sub(/\AThe\s+/i, '')) if page.content.nil? && school.match?(/\AThe\s/i)
      # Skip this institute when neither lookup found a page
      next if page.content.nil?

      description = page.summary
      picture = page.image_urls.first
      # Update only the record currently being iterated over
      institute.update!(description: description, picture: picture)
    end
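
If it helps to test the "The" handling on its own, the prefix logic can be pulled into a small helper. This is only a sketch: candidate_titles is a hypothetical name, not part of the wikipedia-client gem, and it assumes the only variation you care about is a leading "The" followed by whitespace (so a name like "Theatre Academy" is left alone).

```ruby
# Hypothetical helper (not part of the wikipedia-client gem).
# Returns the titles to try, in order: the name as stored, then the
# name without a leading "The " when that prefix is present.
def candidate_titles(name)
  stripped = name.strip
  # \A anchors at the start of the string; \s+ requires whitespace after
  # "The", so words that merely begin with "the" are not touched.
  without_article = stripped.sub(/\AThe\s+/i, '')
  # uniq collapses the two entries when there was no prefix to remove
  [stripped, without_article].uniq
end

candidate_titles('The Juilliard School') # => ["The Juilliard School", "Juilliard School"]
candidate_titles('Theatre Academy')      # => ["Theatre Academy"]
```

Inside the loop you could then call Wikipedia.find for each candidate in turn and keep the first page whose content is not nil.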