Search code examples
ruby-on-railsrubydatabaseimport-from-excel

Extract Data from Excel Spreadsheet into Database in Ruby


I'm looking to extract quite a lot of data from spreadsheets for a Rails application, but I'm doing the extraction through the Rails Console.

I have a table called instititutes in my database which currently has around 170 records. I've found far better data in a spreadsheet with c.1000 records and want to import that to my database, but to skip any records that match already.

I thought that using the find_or_create_by method would be the best move for this. The spreadsheet has UKPRN and NAME as headers, and the table has ukprn and name as respective columns (as well as others).

Using the Roo gem, I've got this so far:

require 'roo'

xlsx = Roo::Spreadsheet.open(File.expand_path('../Downloads/UKPRN.xlsx'))
xlsx.default_sheet = xlsx.sheets.last
header = xlsx.row(1)
xlsx.each_row do |row|
    row = Institute.find_or_create_by(UKPRN: ukprn , NAME: name)
end

This is giving the error NameError: unitialized constant UKPRN. I'm still getting my head back into Ruby so any help would be appreciated.

I'm using Ruby 2.2.3 and Rails 5.0.1

Edit:

p row shows:

["UKPRN", "NAME"]
[10000291, "Anglia Ruskin University"]
[10000385, "The Arts University Bournemouth"]
[10000571, "Bath Spa University"]
[10000712, "University College Birmingham"]
[10000824, "Bournemouth University"]
[10000886, "The University of Brighton"]
[10000961, "Brunel University London"]
...etc

Solution

  • Your table has ukprn and name as respective columns, so find_or_create should look like :

    Institute.find_or_create_by(ukprn: ukprn , name: name)
    

    Now you just need to initialize ukprn and name from row.

    require 'roo'
    
    xlsx = Roo::Excelx.new(File.expand_path('../Downloads/UKPRN.xlsx'))
    
    xlsx.each_row_streaming(offset: 1) do |row|
      Institute.find_or_create_by(ukprn: row[0].value, name: row[1].value)
    end
    

    To execute this code, either :

    • put it in db/seeds.rb and execute rake db:seed
    • put it in script.rb and run rails runner script.rb
    • copy-paste it in console (not really recommended)