Ruby hex to string conversion involving French characters


This is related to the ruby-smpp gem, which I'm using in my project.

I have a string of bytes \u0000\xE0\u0000\xE2\u0000\xE1\u0000\xE8\u0000\xEA\u0000\xE9\u0000\xE7. It is the body of a received (MO, i.e. mobile-originated) message written in French. The actual content of this message is àâáèêéç. How can I convert \u0000\xE0\u0000\xE2\u0000\xE1\u0000\xE8\u0000\xEA\u0000\xE9\u0000\xE7 to àâáèêéç in Ruby?

I've tried

    ["\u0000\xE0\u0000\xE2\u0000\xE1\u0000\xE8\u0000\xEA\u0000\xE9\u0000\xE7"].pack('H*')
    #=> "\x00\x02\x01\b\n\t\a"

and

    ['E0','E2','E1','E8', 'EA', 'E9', 'E7'].pack('H*')
    #=> "\xE0"

Neither gives the expected result.

Thanks in advance!


Solution

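  • Your string looks like it is UTF-16BE encoded: each character takes two bytes, a 0x00 high byte followed by the character's code point (0xE0 for à, 0xE2 for â, and so on). Transcode it to UTF-8: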

    str = "\u0000\xE0\u0000\xE2\u0000\xE1\u0000\xE8\u0000\xEA\u0000\xE9\u0000\xE7"
    
    str.encode('UTF-8', 'UTF-16BE')
    #=> "àâáèêéç"
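
If the message body arrives as raw bytes rather than as a string literal, the same conversion can be sketched as below. This is only an illustration under stated assumptions: the variable names `raw`, `text` and `via_unpack` are made up for the example, and the bytes are hard-coded where real code would presumably take them from the received message.

```ruby
# Raw bytes of the message body, written with \x escapes only and tagged as binary.
# (Hypothetical stand-in for the body received from the gateway.)
raw = "\x00\xE0\x00\xE2\x00\xE1\x00\xE8\x00\xEA\x00\xE9\x00\xE7".b

# Relabel the bytes as UTF-16BE (no bytes change), then transcode to UTF-8.
text = raw.dup.force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8)

# Alternative: read big-endian 16-bit integers and pack them as UTF-8 code points.
via_unpack = raw.unpack('n*').pack('U*')

puts text
puts text == via_unpack
```

The `unpack('n*').pack('U*')` variant treats every 16-bit unit as a complete code point, so it is only suitable when the text contains no surrogate pairs; the `force_encoding` plus `encode` route does not have that limitation.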