ruby xml encoding aptana console2

Windows cmd doesn't display Hebrew characters from a Ruby app


I have an XML file that also contains Hebrew characters, but when I print it to the screen (with the Windows equivalent of 'cat') it shows gibberish. So I installed Console2 and changed the code page (chcp) to either 862 or 1255, and then it is displayed correctly. But when I then try to parse the file with either Nokogiri or REXML, it throws an exception: "malformed xml: missing tag start".

If anyone knows how to get out of this mess, please tell me what to save in which encoding to get this working. I have searched the whole Web without finding a suitable solution.

I am using Windows XP 32-bit. My Ruby IDE is Aptana 3, and the Ruby version is 1.8.7.

Can anyone please help me here?


Solution

  • Ruby 1.8.7 is known to have problems with character encodings. There is a similar question ("Encoding in Ruby 1.8.7 or 1.9.2") which may help you find a way. Install the character-encodings gem and require it in your Ruby file. Then write each string you want to handle with the u prefix, as in u'myStrüng' (a German umlaut is used here as an example).
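
As a complement to the answer above, here is a minimal sketch of the usual way to separate the two concerns: keep the text in UTF-8 for the XML parser, and convert to the console's code page only at the moment of printing. It assumes Ruby 1.9 or later, where String#encode is available, so it does not run on 1.8.7 as written. The helper name to_console and the choice of Windows-1255 as the target are illustrative assumptions, not something taken from the question.

```ruby
# Sketch only: assumes Ruby 1.9+ (String#encode does not exist in 1.8.7).
# to_console is a hypothetical helper name chosen for this example.

# Convert UTF-8 text to the console's code page just before printing.
# Characters that the target code page cannot represent become "?".
def to_console(text, code_page = "Windows-1255")
  text.encode(code_page, :invalid => :replace, :undef => :replace, :replace => "?")
end

# "shalom" in Hebrew, written with escapes so the source file stays ASCII.
hebrew_utf8 = "\u05E9\u05DC\u05D5\u05DD"

# Hand the UTF-8 string to the XML parser unchanged; convert only for display.
console_bytes = to_console(hebrew_utf8)
```

With the console switched to the matching code page (chcp 1255), printing console_bytes should show the Hebrew text, while the XML file itself stays in the encoding its declaration names. On Ruby 1.8.7 the standard Iconv library would play the same role, for example Iconv.conv("WINDOWS-1255", "UTF-8", text), provided the installed iconv supports that code page.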