Go is designed to handle Unicode/UTF-8 properly. However, I am having trouble getting UTF-8 characters to print correctly on my terminal's standard output.
Here is the simplest program that shows the problem:
package main

import "fmt"

func main() {
	fmt.Println("Hello, 世界")
}
When executed, it prints mis-encoded characters:
$ go run hello.go
Hello, 世界
My terminal's locale is set correctly:
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
I am using vim, with set encoding=utf-8 and set fileencodings=utf-8 in my .vimrc file.
This might actually be a vim problem. I used nano to write the same program from scratch, named it hello2.go, and it prints Hello, 世界 correctly. But the original hello.go, created with vim, only gives me the gibberish Hello, 世界.
To double-check that my vim-created hello.go is UTF-8 Unicode text, I ran the file command on it:
$ file hello.go
hello.go: C source, UTF-8 Unicode text
So what gives? Why does my vim-created hello.go print gibberish while my nano-created hello2.go (which contains exactly the same lines of code) does not?
$ file hello2.go
hello2.go: C source, UTF-8 Unicode text
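Since file reports valid UTF-8 for both files, it can help to look at what the string literal actually contains rather than how an editor renders it. The following is only a minimal sketch: describe is a hypothetical helper name, and it uses fmt's %+q verb (which escapes every non-ASCII character) together with a hex dump of the raw bytes. If the literal in a source file were not really 世界, the escapes would show different code points.

```go
package main

import "fmt"

// describe (hypothetical helper) returns the string with all non-ASCII
// characters escaped, followed by its raw bytes in space-separated hex.
func describe(s string) string {
	return fmt.Sprintf("%+q % x", s, []byte(s))
}

func main() {
	// Paste the suspicious literal from the source file here.
	fmt.Println(describe("Hello, 世界"))
}
```

For a correctly saved 世界 the escaped form is \u4e16\u754c and the bytes are e4 b8 96 e7 95 8c; anything else means the editor stored different characters.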
In fact, when I open the vim-created hello.go with nano, the source code reads:
package main
import "fmt"
func main() {
fmt.Println("Hello 世界")
}
But if I open the same vim-created hello.go with vim, the source code reads:
package main
import "fmt"
func main() {
fmt.Println("Hello, 世界")
}
Why is this so?
These are the offending lines in my .vimrc that are causing the problem:
if has("gui_running")
  set guitablabel=%t%=%m "Set the label of the tabs
  set nomacatsui anti enc=utf-8 tenc=macroman gfn=Monaco:h11
  " set window size
  set lines=40
  set columns=120
else
  set enc=utf-8 tenc=macroman gfn=Monaco:h11
  set fenc=utf-8
endif
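Specifically, tenc=macroman is corrupting my encoding at the terminal level. As far as I can tell, with enc=utf-8 and tenc=macroman vim treats every byte coming from the terminal as a separate MacRoman character and re-encodes it as UTF-8, so the saved file is still valid UTF-8 but contains the wrong characters. Vim converts them back when displaying, which is why the file looks fine inside vim only.
I switched it to tenc=utf-8 and all is good.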
Wasted 4 hours of my life on this I-should-have-seen-this-coming problem! Ugh.
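For anyone curious how this kind of mix-up produces valid UTF-8 that is still wrong, here is a rough simulation in Go. It is a sketch under assumptions, not what vim does internally: misreadAsMacRoman is a hypothetical helper, and the macRoman table is deliberately partial, covering only the six bytes in the UTF-8 encoding of 世界.

```go
package main

import "fmt"

// macRoman is a partial MacRoman decoding table (assumption: illustration
// only). It covers just the six bytes of "世界" encoded as UTF-8.
var macRoman = map[byte]rune{
	0xE4: '\u2030', // per mille sign
	0xB8: '\u220F', // n-ary product
	0x96: '\u00F1', // n with tilde
	0xE7: '\u00C1', // A with acute
	0x95: '\u00EF', // i with diaeresis
	0x8C: '\u00E5', // a with ring above
}

// misreadAsMacRoman (hypothetical helper) treats each byte of s as one
// MacRoman character and returns the result re-encoded as UTF-8.
func misreadAsMacRoman(s string) string {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		b := s[i]
		if r, ok := macRoman[b]; ok {
			out = append(out, r)
		} else {
			// Bytes outside the table pass through as-is; this is only
			// correct for ASCII, other high bytes are not modeled.
			out = append(out, rune(b))
		}
	}
	return string(out)
}

func main() {
	fmt.Println(misreadAsMacRoman("Hello, 世界"))
}
```

The two characters become six unrelated ones, each a perfectly legal code point, which would explain why the file command still reports UTF-8 Unicode text.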