ruby-on-rails, ruby, utf-8, character-encoding

Force strings to UTF-8 from any encoding


In my Rails app I'm working with RSS feeds from all around the world, and some feeds contain links that are not encoded in UTF-8. The original feed links are outside my control, and they need to be in UTF-8 before I can use them in other parts of the app.

How can I detect the encoding and convert to UTF-8?


Solution

  • Ruby 1.9

    "Forcing" an encoding is easy, but it only changes the encoding label attached to the string; it does not convert the underlying bytes:

    str = str.force_encoding('UTF-8')
    
    str.encoding.name # => 'UTF-8'
    

    If you want to perform a conversion, use encode:

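    begin
      # encode returns a new string transcoded from str's current encoding
      str = str.encode("UTF-8")
    rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
      # ...
    end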
    

    For more background, I recommend reading the following post:
    http://graysoftinc.com/character-encodings/ruby-19s-string
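
To make the difference between force_encoding and encode concrete, here is a minimal sketch. It assumes you already know the feed's real encoding (for example from its HTTP headers or XML declaration), since String itself does not guess one for you. The helper name to_utf8 and the sample bytes are made up for illustration; only force_encoding, encode, and valid_encoding? are core String methods.

```ruby
# Hypothetical helper (not part of Ruby or Rails): returns a UTF-8 copy of
# `str`, treating its bytes as being in `source_encoding`.
def to_utf8(str, source_encoding)
  # encode(dst, src, options) transcodes from src to dst. Bytes that are
  # invalid in the source, or have no UTF-8 equivalent, become "?" instead
  # of raising an exception.
  str.encode("UTF-8", source_encoding,
             invalid: :replace, undef: :replace, replace: "?")
end

# Sample input: "caf" followed by byte 0xE9, which is "é" in ISO-8859-1
# but is not a valid UTF-8 sequence on its own.
latin1_bytes = "caf\xE9".dup.force_encoding("BINARY")

# Relabelling alone leaves the bytes untouched, so the result is broken UTF-8.
relabelled = latin1_bytes.dup.force_encoding("UTF-8")
relabelled.valid_encoding?   # => false

# Transcoding rewrites the bytes, so the result is valid UTF-8.
converted = to_utf8(latin1_bytes, "ISO-8859-1")
converted.valid_encoding?    # => true
converted.encoding.name      # => "UTF-8"
```

Replacing unconvertible bytes with "?" is one possible policy; rescuing the exception, as in the answer above, is the alternative when you would rather skip or log a bad link than alter it.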