I'm looking to extract quite a lot of data from spreadsheets for a Rails application, but I'm doing the extraction through the Rails Console.
I have a table called instititutes
in my database which currently has around 170 records. I've found far better data in a spreadsheet with c.1000 records and want to import that to my database, but to skip any records that match already.
I thought that using the find_or_create_by method would be the best move for this. The spreadsheet has UKPRN
and NAME
as headers, and the table has ukprn
and name
as respective columns (as well as others).
Using the Roo gem, I've got this so far:
require 'roo'
xlsx = Roo::Spreadsheet.open(File.expand_path('../Downloads/UKPRN.xlsx'))
xlsx.default_sheet = xlsx.sheets.last
header = xlsx.row(1)
xlsx.each_row do |row|
row = Institute.find_or_create_by(UKPRN: ukprn , NAME: name)
end
This is giving the error NameError: unitialized constant UKPRN
. I'm still getting my head back into Ruby so any help would be appreciated.
I'm using Ruby 2.2.3 and Rails 5.0.1
Edit:
p row
shows:
["UKPRN", "NAME"]
[10000291, "Anglia Ruskin University"]
[10000385, "The Arts University Bournemouth"]
[10000571, "Bath Spa University"]
[10000712, "University College Birmingham"]
[10000824, "Bournemouth University"]
[10000886, "The University of Brighton"]
[10000961, "Brunel University London"]
...etc
Your table has ukprn
and name
as respective columns, so find_or_create
should look like :
Institute.find_or_create_by(ukprn: ukprn , name: name)
Now you just need to initialize ukprn
and name
from row
.
require 'roo'
xlsx = Roo::Excelx.new(File.expand_path('../Downloads/UKPRN.xlsx'))
xlsx.each_row_streaming(offset: 1) do |row|
Institute.find_or_create_by(ukprn: row[0].value, name: row[1].value)
end
To execute this code, either :
db/seeds.rb
and execute rake db:seed
script.rb
and run rails runner script.rb