ruby csv apostrophe

Read a CSV file containing an apostrophe in Ruby


I am trying to read a CSV file, and Ruby stops reading once it encounters an arrow character in the file. The arrow is supposed to be an apostrophe. I can't replace it in the CSV because when I copy and paste it, it pastes as a space.

I tried CSV.foreach, and I also tried File.open followed by each_line. Both approaches have the same problem.

The text editor shows the character as SUB, highlighted in black.

How can I solve this problem?

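require "csv"

# Attempt 1: parse the file row by row with the CSV library
CSV.foreach(filename) do |row|
  puts row
end

# Attempt 2: read the file line by line without CSV parsing
File.open(filename, "r") do |f|
  f.each_line do |row|
    puts row
  end
end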



Solution

  • If your file isn't encoded the way Ruby expects by default, you need to specify the encoding explicitly when you call foreach, like this:

    CSV.foreach(filename, encoding: Encoding::UTF_8)
    

    If you're not sure how the file is encoded, you can use String#encode as a pretty heavy hammer to clean it up, though any characters that can't be converted are lost in the process.

    File.read(filename).encode(
      Encoding::UTF_8,
      undef: :replace,
      invalid: :replace,
      replace: '' 
    )
    

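    These options say that any byte sequence that is invalid in the source encoding, or any character that is undefined in the target encoding, gets replaced, here with an empty string (that is, dropped). Of course, you can tweak the options to get the result you'd like.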
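
As a rough sketch of how these pieces could fit together, the helper below reads the file as raw bytes, assumes the content is meant to be UTF-8, drops invalid bytes, and turns the SUB character (0x1A) into the apostrophe the question says it should be. The name `read_csv_rows` and its `source_encoding` parameter are hypothetical, not part of the original answer, and the cleanup uses String#scrub (Ruby 2.1+) rather than String#encode. Reading in binary mode is an assumption about the cause: on Windows, a text-mode read can treat 0x1A as end-of-file, which would match Ruby stopping at that character.

```ruby
require "csv"

# Hypothetical helper (not from the original answer): returns the parsed
# rows of a CSV file after cleaning up problematic bytes.
def read_csv_rows(path, source_encoding: Encoding::UTF_8)
  # Binary read, so a SUB byte (0x1A) is not taken as end-of-file.
  raw = File.binread(path)

  # Assumption: the bytes are meant to be in source_encoding (UTF-8 here).
  # scrub("") drops any byte sequence that is invalid in that encoding.
  text = raw.force_encoding(source_encoding).scrub("")

  # The question says the SUB character should have been an apostrophe.
  text = text.tr("\x1A", "'")

  CSV.parse(text)
end
```

For example, `read_csv_rows(filename).each { |row| puts row.inspect }` would print every parsed row, including the ones after the SUB character.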