
JSON encoding wrongly escaped (Rails 3, Ruby 1.9.2)


In my controller, the following works (it prints "oké"):

puts obj.inspect

But this doesn't (it renders "ok\u00e9"):

render :json => obj

Apparently the to_json method escapes unicode characters. Is there an option to prevent this?


Solution

  • If you dig through the source you'll eventually come to ActiveSupport::JSON::Encoding and the escape method:

    def escape(string)
      if string.respond_to?(:force_encoding)
        string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
      end
      json = string.
        gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
        gsub(/([\xC0-\xDF][\x80-\xBF]|
               [\xE0-\xEF][\x80-\xBF]{2}|
               [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
        s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')
      }
      json = %("#{json}")
      json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
      json
    end
    

    The second gsub call converts every non-ASCII UTF-8 sequence to the \uXXXX notation that you're seeing. These \uXXXX escapes are valid JSON, so any conforming JSON parser will decode them back to the original characters, but you could always post-process the JSON (or monkey-patch in a modified JSON escaper) to convert the \uXXXX notation to raw UTF-8 if necessary.

    I'd agree that forcing JSON to be 7-bit clean is a bit bogus, but there you go.

    Short answer: no, there is no option to prevent this.
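
For the post-processing route, here is a minimal sketch. It assumes the JSON has already been produced as a UTF-8 string (for example by to_json). The name unescape_non_ascii is a hypothetical helper of my own, not part of Rails or ActiveSupport. It only decodes \uXXXX escapes for non-ASCII characters. ASCII escapes (quotes, backslashes, control characters) and UTF-16 surrogate halves are left alone, because decoding those could produce invalid JSON or invalid UTF-8.

```ruby
# Hypothetical helper (not part of Rails): turn \uXXXX escapes for non-ASCII
# characters back into raw UTF-8 inside an already-encoded JSON string.
def unescape_non_ascii(json)
  # Match every backslash escape, so an escaped backslash is consumed as a
  # unit and is never mistaken for the start of a \u escape.
  json.gsub(/\\(?:u([0-9a-fA-F]{4})|.)/m) do |match|
    hex = $1
    next match unless hex            # not a \u escape: keep it as-is

    code = hex.hex
    # Keep ASCII escapes and UTF-16 surrogate halves escaped.
    next match if code < 0x80 || (0xD800..0xDFFF).cover?(code)

    [code].pack("U")                 # code point -> UTF-8 character
  end
end
```

In a controller this could be called as render :json => unescape_non_ascii(obj.to_json), on the assumption that render :json passes a String through without encoding it again. Check that against your Rails version before relying on it.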