I'm trying to insert a large amount of information into an SQLite3 database using a Ruby script. After about 250 calls to db_prepare_location.execute, it stops working with:
.rvm/gems/ruby-1.9.2-p290/gems/sqlite3-1.3.6/lib/sqlite3/statement.rb:67:in `step': unable to open database file (SQLite3::CantOpenException)
from /Users/ashley/.rvm/gems/ruby-1.9.2-p290/gems/sqlite3-1.3.6/lib/sqlite3/statement.rb:67:in `execute'
from programs.rb:57:in `get_program_details'
from programs.rb:22:in `block in get_link'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1768:in `each'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1202:in `block in foreach'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1340:in `open'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1201:in `foreach'
from programs.rb:20:in `get_link'
from programs.rb:63:in `<module:Test>'
from programs.rb:15:in `<main>'
And here's my code:
require 'net/http'
require 'json'
require 'nokogiri'
require 'open-uri'
require 'csv'
require 'sqlite3'
require "bundler/setup"
require "capybara"
require "capybara/dsl"

Capybara.run_server = false
Capybara.default_driver = :selenium
Capybara.current_driver = :selenium

module Test
  class Tree
    include Capybara::DSL

    def get_link
      CSV.foreach("links.csv") do |row|
        link = row[0]
        get_details(link)
      end
    end

    def get_details(link)
      db = SQLite3::Database.open "development.sqlite3"
      address = []
      address_text = []
      visit("#{link}")
      name = find("#listing_detail_header").find("h3").text
      page.find(:xpath, "//div[@id='listing_detail_header']").all(:xpath, "//span/span").each {|span| address << span }
      if address.size == 4
        street_address = address[0].text
        address.shift
        address.each {|a| address_text << a.text }
        city_state_address = address_text.join(", ")
      else
        puts link
        street_address = ""
        city_state_address = ""
      end
      if page.has_css?('.provider-click_to_call')
        find(".provider-click_to_call").click
        phone_number = find("#phone_number").text.gsub(/[()]/, "").gsub(" ", "-")
      else
        phone_number = ""
      end
      if page.has_css?('.provider-website_link')
        website = find(".provider-website_link")[:href]
      else
        website = ""
      end
      description = find(".listing_details_list").find("p").text
      db_prepare_location = db.prepare("INSERT INTO programs(name, city_state_address, street_address, phone_number, website, description) VALUES (?, ?, ?, ?, ?, ?)")
      db_prepare_location.bind_params name, city_state_address, street_address, phone_number, website, description
      db_prepare_location.execute
    end
  end

  test = Test::Tree.new
  test.get_link
end
What is the problem here and what can I do to fix it? Let me know if additional info is needed.
You could be running out of file descriptors. Every time you call get_details, you open the SQLite database:

db = SQLite3::Database.open "development.sqlite3"

but you never explicitly close it; instead, you're relying on the garbage collector to clean up all your db objects and close all your file descriptors. Each time you open the database a file descriptor has to be allocated, and closing the database frees that descriptor again. If you're calling get_details faster than the GC can clean things up, you will run out of file descriptors and subsequent SQLite3::Database.open calls will fail.
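If you want to confirm that this is what's happening, one quick check (just a diagnostic sketch, not something to leave in the script) is to count the database handles that haven't been closed yet, since each unclosed handle is holding a descriptor:

# Drop this in temporarily, e.g. at the end of get_details, to watch the
# number of unclosed handles grow as the loop runs.
open_handles = ObjectSpace.each_object(SQLite3::Database).count { |d| !d.closed? }
puts "unclosed SQLite3::Database handles: #{open_handles}"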
Try adding db.close at the end of get_details. You'll probably have to close the prepared statement as well, so call db_prepare_location.close before db.close:
def get_details
  # ...
  db_prepare_location.close
  db.close
end
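One thing to keep in mind: if an exception is raised partway through get_details (Capybara's find raises when an element is missing), the two close calls at the bottom would be skipped and the descriptor would leak anyway. A minimal sketch of guarding against that with ensure, with the scraping code elided (the bound variables are assumed to be built exactly as in your current method):

def get_details(link)
  db = SQLite3::Database.open "development.sqlite3"
  db_prepare_location = nil
  begin
    # ... visit the page and build name, city_state_address, street_address,
    #     phone_number, website and description exactly as before ...
    db_prepare_location = db.prepare("INSERT INTO programs(name, city_state_address, street_address, phone_number, website, description) VALUES (?, ?, ?, ?, ?, ?)")
    db_prepare_location.bind_params name, city_state_address, street_address, phone_number, website, description
    db_prepare_location.execute
  ensure
    db_prepare_location.close if db_prepare_location  # close the statement first
    db.close                                          # then release the file descriptor
  end
end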
Yes, Ruby has garbage collection, but that doesn't mean that you don't have to manage your resources by hand.
Another option (which DGM was hinting at) would be to open a connection to the database in your constructor:
def initialize
  @db = SQLite3::Database.open "development.sqlite3"
end
and then drop the SQLite3::Database.open call in get_details and use @db instead. You wouldn't need a db.close in get_details anymore, but you'd still want the db_prepare_location.close call.
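A rough sketch of how the tail of get_details might look with that change, assuming the scraping code above it stays exactly as it is:

def get_details(link)
  # ... visit the page and build the values as before ...
  db_prepare_location = @db.prepare("INSERT INTO programs(name, city_state_address, street_address, phone_number, website, description) VALUES (?, ?, ?, ?, ?, ?)")
  db_prepare_location.bind_params name, city_state_address, street_address, phone_number, website, description
  db_prepare_location.execute
  db_prepare_location.close  # the statement still has to be closed; @db stays open for the next row
end

You could then close @db once, after the CSV.foreach loop in get_link has finished, so the single connection is released when all of the inserts are done.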