Tags: mysql, go, encoding, iso-8859-1

MySQL encoding problem when inserting with Go driver


I'm trying to store UTF-8 text in a table whose collation is latin1_swedish_ci. I can't change the encoding since I don't have direct access to the database. So I'm trying to encode the text into Latin-1 with this Go library, which provides the encoder, and this one, which has a function that wraps the encoder so that it replaces invalid characters instead of returning an error.

But when I try to insert the row, MySQL complains: Error 1366: Incorrect string value: '\\xE7\\xE3o pa...' for column 'description' at row 1.

I also tried writing the same text to a file, and file -I reports: file.txt: application/octet-stream; charset=binary.

Example

package main

import (
    "fmt"
    "os"

    "golang.org/x/text/encoding"
    "golang.org/x/text/encoding/charmap"
)

func main() {
    s := "foo – bar"

    encoder := charmap.ISO8859_1.NewEncoder()
    encoder = encoding.ReplaceUnsupported(encoder)

    encoded, err := encoder.String(s)
    if err != nil {
        panic(err)
    }

    fmt.Println(s)
    fmt.Println(encoded)
    fmt.Printf("%q\n", encoded)

    /* file test */
    f, err := os.Create("file.txt")
    if err != nil {
        panic(err)
    }
    defer f.Close()

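    // Encode to Latin-1 while writing; check the error instead of discarding it.
    w := encoder.Writer(f)
    if _, err := w.Write([]byte(s)); err != nil {
        panic(err)
    }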
}
    

I'm probably missing something very obvious, but my knowledge of encodings is very poor.

Thanks in advance.


Solution

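  • Were you expecting çã? (Those are the characters that the bytes E7 E3 represent in latin1.)

    The problem is easily solved. MySQL will gladly convert between the client's character set and the column's while INSERTing text, but you must tell it that your client is sending latin1. That is probably done when connecting to MySQL, and the connection probably defaults to utf8 or UTF-8 or utf8mb4 currently, so your latin1 bytes are rejected as invalid UTF-8. The connection parameter is something like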

    charset=latin1