Rails 3.2.2
Ruby 1.9.3
all browsers I have tried (Aurora, IE, Chrome)
Windows 7
I have enjoyed being able to reliably pass Unicode back and forth between the database and the client. As far as I can tell, it works flawlessly.
That only makes my current problem more vexing. When a string literal in my Rails code contains a non-ASCII character (in a method defined on a model, for example), WEBrick fails completely and returns a 500 "We're sorry, but something has happened" error.
For example, suppose I have the string "O₂". I can post that value through a form, read it back later, and everything comes out fine. But say I have a method:
def fix_name molecule_name
  fixed_name = molecule_name
  case molecule_name
  when 'O2' then fixed_name = 'O₂'
  end
  return fixed_name
end
Then if I call fix_name, even if the case falls through without matching, the server fails abruptly (right after saying it has successfully rendered the generic new.html page).
Furthermore, if I switch to specifying the character with escape sequences, as in
def fix_name molecule_name
  fixed_name = molecule_name
  case molecule_name
  when 'O2' then fixed_name = "O\x20\x82"
  end
  return fixed_name
end
I get the generic "�" character instead of "₂".
Has anyone else had this problem? What could be going on here?
Update:
Okay, having educated myself a little better on Unicode and UTF-8 I am able to be a little less neanderthal about this.
The fix for the code I posted is either
def fix_name molecule_name
  fixed_name = molecule_name
  case molecule_name
  when 'O2' then fixed_name = "O\xe2\x82\x82"
  end
  return fixed_name
end
or probably better:
def fix_name molecule_name
  fixed_name = molecule_name
  case molecule_name
  when 'O2' then fixed_name = "O\u{2082}"
  end
  return fixed_name
end
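So the moral there is to get the byte sequence right! The code point U+2082 is encoded in UTF-8 as the three bytes E2 82 82, not as the two bytes 20 82.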
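A quick way to check this is to dump the bytes behind each form. The following is only a sketch: the constant names and the hex_bytes helper are made up for illustration, and it assumes the file is read as UTF-8.

```ruby
# encoding: UTF-8
# Sketch: compare the bytes produced by each way of writing the subscript two.
# WRONG, HAND_ENCODED, ESCAPED and hex_bytes are illustrative names only.

WRONG        = "O\x20\x82"      # code point digits mistakenly used as raw bytes
HAND_ENCODED = "O\xe2\x82\x82"  # the UTF-8 bytes of U+2082 written out by hand
ESCAPED      = "O\u{2082}"      # code point escape; Ruby produces the UTF-8 bytes

# Render a string's bytes as space-separated hex pairs.
def hex_bytes(str)
  str.bytes.map { |b| format('%02x', b) }.join(' ')
end

puts hex_bytes(WRONG)         # 4f 20 82
puts hex_bytes(HAND_ENCODED)  # 4f e2 82 82
puts hex_bytes(ESCAPED)       # 4f e2 82 82

puts WRONG.valid_encoding?    # false: a lone 0x82 byte is not valid UTF-8
puts ESCAPED.valid_encoding?  # true
puts HAND_ENCODED == ESCAPED  # true
```

The first form is "O", a space (0x20), and a stray continuation byte, which would explain the "�" in the browser.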
But that still doesn't explain why I can't put the literal character in.
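Ruby 1.9.3 is, let us say, 'stringent' about encoding in your .rb files. Unless told otherwise, it assumes a source file is US-ASCII, so a literal non-ASCII character such as "₂" makes the file fail to parse with an "invalid multibyte char (US-ASCII)" error. That would also explain why the server fails even when the case never matches: the problem occurs when the file is loaded, not when the branch runs.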
http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/
http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings
The short answer is that you should try adding:
# encoding: UTF-8
to the top of your .rb file (it has to be the first line, or the second if the file starts with a shebang). Ruby 1.9.3 should then not choke on UTF-8 characters in your code.
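As a rough sketch of what the fixed file could look like, here is the method from the question with the magic comment in place. It assumes your editor actually saves the file as UTF-8; the method body is slightly condensed from the original.

```ruby
# encoding: UTF-8
# The magic comment above tells Ruby to read this source file as UTF-8,
# so the literal subscript character below parses cleanly.

def fix_name(molecule_name)
  case molecule_name
  when 'O2' then 'O₂'
  else molecule_name
  end
end

puts __ENCODING__              # source encoding of this file
puts fix_name('O2')            # the literal character comes through intact
puts fix_name('O2').encoding   # string literals inherit the source encoding
puts fix_name('H2O')           # unmatched names pass through unchanged
```

Printing __ENCODING__ is a handy check: if it reports US-ASCII, the magic comment is missing or not on the first line.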