
FasterCSV: Check whether a file is invalid before accepting it - is there a simpler way?


I'm using FasterCSV on a Ruby on Rails application and currently it throws an Exception if the file is invalid.

I've looked over the FasterCSV docs, and it seems that if I use FasterCSV::parse with a block, it'll read the file one line at a time, without allocating too much memory. It'll raise a FasterCSV::MalformedCSVError exception if there is any kind of error in the file.

I've implemented a custom solution, but I'm not sure it's the best possible one (see my answer below). I'd be interested in knowing about alternatives.


Solution

  • I ran some tests yesterday and it turns out that my solution didn't quite work: I kept getting empty arrays on valid CSVs after implementing the first is_valid. I'm not sure whether it's a FasterCSV caching issue or something in my code, and I don't know if it's related to my test setup, but I decided to implement a safe_parse instead:

    # /lib/faster_csv_safe_parse.rb
    class FasterCSV
    
      def self.safe_parse(file, options = {})
        begin
          FasterCSV.parse(file, options)
        rescue FasterCSV::MalformedCSVError
          nil
        end
      end
    
    end
    

    This will return a parsed array if the file is valid, or nil otherwise. I could then implement my validations as follows:

    # /models/csv_importer.rb
    
    class CsvImporter
      include ActiveRecord::Validations
    
      validates_presence_of :file
      validate :check_file_format
      attr_accessor :csv_data
    
      def csv_data
        @csv_data ||= FasterCSV.safe_parse(file)
      end
    
    ...
    
      private
    
      def check_file_format
        errors.add :file, "Malformed CSV! Please check syntax" if csv_data.nil?
      end
    end
    

    I guess it would be possible to implement a safe_parse that accepts a block and parses the file line by line, but for my purposes this simple implementation was enough, and it has worked in every case I've tried so far.
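For reference, a block-accepting variant might look roughly like the sketch below. It is only an illustration under a few assumptions: it uses the `CSV` class from the Ruby standard library (FasterCSV was merged into it as of Ruby 1.9, where the error class is `CSV::MalformedCSVError`), and `safe_parse_rows` is a hypothetical helper name, not something either library provides.

```ruby
require "csv"

# Hypothetical helper (not part of FasterCSV or the CSV library).
# Yields each parsed row to the block as it is read.
# Returns true if the whole input parsed, or nil if a malformed line was hit.
def safe_parse_rows(data, options = {})
  CSV.parse(data, **options) { |row| yield row }
  true
rescue CSV::MalformedCSVError
  nil
end

# Usage: collect rows from a well-formed string.
rows = []
ok = safe_parse_rows("a,b\n1,2\n") { |row| rows << row }

# Usage: an unclosed quote makes the parser raise, so the helper returns nil.
bad = safe_parse_rows("a,b\n\"unclosed,2\n") { |row| row }
```

One caveat with this design: rows that come before the malformed line may already have been passed to the block by the time the error is raised. If the block has side effects (saving records, for instance), a separate validation pass with a no-op block, or wrapping the import in a transaction, would probably still be needed.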