Ruby sometimes prints unicode-escaped chars instead of the chars themselves. Why?


I use Ruby 2.2.3p173 and Sublime Text 3. All my strings are already encoded as UTF-8. On Linux I get the output I expect. I try to print:

a = [ "Привет" ]
puts "puts=#{a}"
p "p=#{a}"
print "print=#{a}\n"

puts a[ 0 ]
p a[ 0 ]
print a[ 0 ] + "\n"
p a[ 0 ].encoding
p __ENCODING__

Output is:

puts=["\u041F\u0440\u0438\u0432\u0435\u0442"]
"p=[\"\\u041F\\u0440\\u0438\\u0432\\u0435\\u0442\"]"
print=["\u041F\u0440\u0438\u0432\u0435\u0442"]
Привет
"\u041F\u0440\u0438\u0432\u0435\u0442"
Привет
#<Encoding:UTF-8>
#<Encoding:UTF-8>

I expect:

puts=["Привет"]
"p=[\"Привет\"]"
print=["Привет"]
Привет
"Привет"
Привет

How can I print an Array containing several UTF-8 strings on one line?


Solution

  • This behavior is mostly seen on Windows. On *nix systems it usually works as expected. For instance, https://repl.it/ shows the right output, most likely because it is hosted on a *nix system.

    According to the documentation, Array#to_s is:

    Alias for: inspect

    Array#inspect invokes inspect on each array member. In this case the members are all strings, so String#inspect is invoked.
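
    The delegation described above can be checked with a small sketch. The variable names are illustrative, and the comparison assumes Encoding.default_internal is not set (the usual case):

    ```ruby
    # Array#to_s is an alias for Array#inspect, which calls inspect on each element.
    words = ["Привет", "мир"]

    via_to_s    = words.to_s
    via_inspect = words.inspect

    # Rebuild the same text by hand from String#inspect of each element.
    rebuilt = "[" + words.map(&:inspect).join(", ") + "]"

    puts via_to_s == via_inspect
    puts via_inspect == rebuilt
    ```

    Both comparisons are expected to hold whether or not the strings come out escaped, because the array output is assembled from the per-element inspect results.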

    String#inspect uses Encoding.default_external for its result, as the Encoding documentation specifies:

    The default external encoding is used by default for strings created from the following locations:

    • CSV
    • File data read from disk
    • SDBM
    • StringIO
    • Zlib::GzipReader
    • Zlib::GzipWriter
    • String#inspect
    • Regexp#inspect

    On Windows, the default external encoding here is not UTF-8, and hence we see escape sequences in the output of String#inspect:

    p Encoding::default_external
    #=> #<Encoding:IBM437>
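
    To see the dependency directly, here is a minimal sketch that switches the default external encoding temporarily and captures what String#inspect returns. The helper name inspect_with_external is hypothetical, and the sketch assumes Encoding.default_internal is nil, since String#inspect prefers the internal encoding when one is set:

    ```ruby
    # Hypothetical helper: run String#inspect under a given default external
    # encoding, then restore the previous setting.
    def inspect_with_external(str, encoding)
      saved = Encoding.default_external
      Encoding.default_external = encoding
      str.inspect
    ensure
      # Always put the global setting back, even if inspect raises.
      Encoding.default_external = saved
    end

    word = "Привет"

    # Under IBM437 (the value seen on the Windows machine above), the UTF-8
    # characters do not match the result encoding, so they are escaped.
    escaped = inspect_with_external(word, Encoding::IBM437)

    # Under UTF-8 the characters are printable as they are.
    plain = inspect_with_external(word, Encoding::UTF_8)

    puts escaped
    puts plain
    ```

    Running Ruby with warnings enabled may print a notice when Encoding.default_external is reassigned. That notice does not affect the result.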
    

    If we change the default external encoding to UTF-8, the output is correct:

    Encoding::default_external = Encoding::UTF_8
    
    a = [ "Привет" ]
    puts "puts=#{a}"
    #=> puts=["Привет"]
    
    p "p=#{a}"
    #=> "p=[\"Привет\"]"
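
    If changing a global setting is undesirable, one alternative is to build the line without calling inspect at all. This is only a sketch: format_list is a hypothetical helper, it does not escape quotes or backslashes inside the strings the way inspect does, and whether the characters display correctly still depends on the console being able to show UTF-8.

    ```ruby
    # Hypothetical helper: format an array of strings like Array#inspect would,
    # but interpolate each string directly instead of calling String#inspect,
    # so the result does not depend on Encoding.default_external.
    def format_list(strings)
      quoted = strings.map { |s| "\"#{s}\"" }
      "[#{quoted.join(', ')}]"
    end

    line = format_list(["Привет", "мир"])
    puts "puts=#{line}"
    ```

    For plain display without brackets and quotes, a.join(", ") on its own is enough.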