golang: why does terminal stdout not print out utf-8 characters?


Go is designed to handle Unicode/UTF-8 properly.

However, I seem to have a problem getting UTF-8 characters printed correctly on my terminal's standard output.

Here is the simplest program:

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}

When executed, it prints garbled characters:

$ go run hello.go
Hello, ‰∏ñÁïå

My terminal's locale is set correctly:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

I am also using vim, with set encoding=utf-8 and set fileencodings=utf-8 in my .vimrc file.

This might actually be a vim problem. I used nano to write the same program from scratch as hello2.go, and it prints Hello, 世界 correctly. But the original hello.go, created with vim, only gives me the gibberish Hello, ‰∏ñÁïå.

To double-check that my vim-created hello.go is UTF-8 Unicode text, I ran the file command on it:

$ file hello.go
hello.go: C source, UTF-8 Unicode text

So what gives? Why does my vim-created hello.go print gibberish when my nano-created hello2.go (which contains exactly the same lines of code) does not?

$ file hello2.go
hello2.go: C source, UTF-8 Unicode text
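
Since file reports both scripts as UTF-8, a byte-level comparison is one way to tell them apart. The following is a minimal Go sketch, not part of the original scripts: hexBytes is a hypothetical helper, and the second string assumes that hello.go really stores the six gibberish characters shown above rather than 世界.

```go
package main

import "fmt"

// hexBytes returns the bytes of s as space-separated hex pairs.
// (Hypothetical helper, introduced only for this illustration.)
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	// The intended text: two characters, three UTF-8 bytes each.
	fmt.Println(hexBytes("世界"))
	// The gibberish, assumed to be what the vim-created file stores:
	// six different characters, each of which is itself valid UTF-8.
	fmt.Println(hexBytes("‰∏ñÁïå"))
}
```

Both strings are well-formed UTF-8, which would explain why file cannot tell the two scripts apart: the difference lies in which characters are stored, not in whether the encoding is valid.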

In fact, when I open the vim-created hello.go with nano, the source code reads:

package main

import "fmt"

func main() {
        fmt.Println("Hello, ‰∏ñÁïå")
}

But if I open the same vim-created hello.go with vim, the source code reads:

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}

Why is this so?


Solution

  • These are the offending lines in my .vimrc that were causing this problem:

    if has("gui_running")
        set guitablabel=%t%=%m  "Set the label of the tabs
        set nomacatsui anti enc=utf-8 tenc=macroman gfn=Monaco:h11
        " set window size
        set lines=40
        set columns=120
    else
        set enc=utf-8 tenc=macroman gfn=Monaco:h11
        set fenc=utf-8
    endif
    

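    Specifically, tenc=macroman was breaking my encoding at the terminal level. With enc=utf-8 and tenc=macroman, vim treats the bytes coming from the terminal as MacRoman and converts them to UTF-8 internally, then converts back when drawing the screen. So the file ends up storing the gibberish characters, while vim itself still displays them as 世界.

    I switched it to tenc=utf-8 and all is good.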

    Wasted 4 hours of my life on this I-should-have-seen-this-coming problem! Ugh.
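
The gibberish is consistent with the UTF-8 bytes of 世界 being read one byte at a time as MacRoman. Below is a minimal Go sketch of that misreading, not code from the original scripts. The names macRoman and misreadAsMacRoman are hypothetical, and the table covers only the six byte values needed here rather than the full MacRoman character set.

```go
package main

import "fmt"

// macRoman maps the six byte values in the UTF-8 encoding of "世界"
// to the characters MacRoman assigns to them. Hypothetical partial
// table for this illustration; real MacRoman defines all high bytes.
var macRoman = map[byte]rune{
	0xE4: '‰', 0xB8: '∏', 0x96: 'ñ', // bytes of 世
	0xE7: 'Á', 0x95: 'ï', 0x8C: 'å', // bytes of 界
}

// misreadAsMacRoman treats every byte of s as one MacRoman character.
// ASCII bytes map to themselves; high bytes missing from the partial
// table become U+FFFD.
func misreadAsMacRoman(s string) string {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		b := s[i]
		if b < 0x80 {
			out = append(out, rune(b))
		} else if r, ok := macRoman[b]; ok {
			out = append(out, r)
		} else {
			out = append(out, '\uFFFD')
		}
	}
	return string(out)
}

func main() {
	fmt.Println(misreadAsMacRoman("Hello, 世界")) // prints "Hello, ‰∏ñÁïå"
}
```

The output matches the gibberish printed by go run hello.go, which supports the reading that the terminal encoding setting, not Go, was at fault.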