I'm using FasterCSV on a Ruby on Rails application and currently it throws an Exception if the file is invalid.
I've looked over the FasterCSV docs, and it seems that if I use FasterCSV::parse with a block, it'll read the file one line at a time, without allocating too much memory. It'll raise a FasterCSV::MalformedCSVError exception if there is any kind of error in the file.
I've implemented a custom solution, but I'm not sure it's the best possible one (see my answer below). I'd be interested in knowing about alternatives.
I ran some tests yesterday and it turns out that my solution didn't quite work: I kept getting empty arrays on valid CSVs after implementing the first is_valid. I'm not sure whether it's a FasterCSV caching issue or something in my code, or whether it's related to my test setup, but I decided to implement a safe_parse instead:
# /lib/faster_csv_safe_parse.rb
class FasterCSV
  def self.safe_parse(file, options = {})
    begin
      FasterCSV.parse(file, options)
    rescue FasterCSV::MalformedCSVError
      nil
    end
  end
end
This will return a parsed array if the file is valid, or nil otherwise. I could then implement my validations as follows:
# /models/csv_importer.rb
class CsvImporter
  include ActiveRecord::Validations
  validates_presence_of :file
  validate :check_file_format
  attr_accessor :csv_data

  # Parse lazily and cache the result; nil means the CSV was malformed
  def csv_data
    @csv_data ||= FasterCSV.safe_parse(file)
  end
  ...
  private

  def check_file_format
    errors.add :file, "Malformed CSV! Please check syntax" if csv_data.nil?
  end
end
I guess it would be possible to implement a safe_parse that accepts a block and parses the file line by line, but for my purposes this simple implementation was enough, and it works in all cases.
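For reference, here is a rough sketch of what that block-accepting variant might look like. It is written against Ruby's standard-library CSV (which FasterCSV became in Ruby 1.9) rather than FasterCSV itself, so that it runs on a current Ruby. The SafeCsv module name is made up for illustration, and I haven't tried this inside the Rails app above:

```ruby
require "csv"

# Hypothetical helper module (the name is an assumption, not part of any library).
module SafeCsv
  # With a block: yields each row as it is parsed and returns true on success.
  # Without a block: returns the parsed array of rows.
  # In both cases, returns nil if the input is malformed.
  def self.safe_parse(data, options = {}, &block)
    if block
      CSV.parse(data, **options, &block)
      true
    else
      CSV.parse(data, **options)
    end
  rescue CSV::MalformedCSVError
    nil
  end
end

# Usage: collect rows one at a time; ok is true for well-formed input.
rows = []
ok = SafeCsv.safe_parse("name,age\nalice,30\n") { |row| rows << row }
```

One caveat with the block form: since rows are handed to the block as they are parsed, some rows may already have been processed by the time a malformed line is reached. If the block has side effects (such as saving records), it is probably safer to validate first and import afterwards.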