python mysql mysql-python mysql-connector mysql-connector-python

How do I handle non-Latin characters (e.g. С крыш наших домов) in MySQL?


I'm using the mysql-connector-python library to connect and write to a MySQL 5.7 database. I've set the encoding to utf8mb4 with cursor.execute('SET CHARACTER SET utf8mb4'), and I also set a charset in my connection settings:

import mysql.connector
from mysql.connector import Error

sg_titles_db_settings = {
    'user': <user>,
    'password': <password>,
    'host': <host>,
    'port': <port>,
    'database': <db>,
    'charset': 'utf8'
}

def get_mysql_connection():
    try:
        db_connection = mysql.connector.connect(**sg_titles_db_settings)
        return db_connection
    except Error as e:
        print("Error: ", e)
        return False

But any non-Latin character, such as an Eastern European letter or a special symbol, is inserted as ?.

Here's the error I receive if I don't change the encoding:

1366 (HY000): Incorrect string value: '\xD0\x9E\xD1\x82\xD0\xB2...' for column...
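
For reference, the bytes quoted in that error can be inspected directly in Python. The sketch below is only an illustration, not what the server does internally: it decodes the visible bytes as UTF-8 and then shows how an encoding without Cyrillic letters (Latin-1 is used here as an assumed example) degrades them to question marks.

```python
# Bytes copied from the error message (the part before the "...").
raw = b"\xD0\x9E\xD1\x82\xD0\xB2"

# They are valid UTF-8 and decode to three Cyrillic letters.
text = raw.decode("utf-8")
print(text)

# Assumption for illustration: the target encoding is Latin-1, which has no
# Cyrillic letters. With errors="replace", Python substitutes "?" for each one.
lossy = text.encode("latin-1", errors="replace")
print(lossy)
```

So the client appears to be sending correct UTF-8; the loss happens wherever the text is converted into a character set that cannot represent it.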

I don't understand what I need to do to resolve this issue. None of the articles I've found has helped.

Thanks in advance!


Solution

  • On the MySQL server, run the command SET character_set_results=utf8; which should fix it. However, the change is not kept once the server is shut down and restarted.

    I'm working with Docker, and there it does not persist. The only way I found to persist the encoding change was to include it in the docker-compose.yml file:

    services:
        <db name>:
            environment:
                LANG: C.UTF-8
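
On the client side, a minimal sketch of requesting utf8mb4 for the connection itself and then checking what the session actually uses might look like the following. The helper names with_utf8mb4 and connection_charsets are hypothetical, and the collation utf8mb4_unicode_ci is an assumed choice; charset, collation and use_unicode are connection arguments accepted by mysql.connector.connect. Note that the target table and column also need a character set that can store these characters, which this sketch does not change.

```python
def with_utf8mb4(settings):
    """Return a copy of the settings that requests utf8mb4 for the connection."""
    updated = dict(settings)
    updated["charset"] = "utf8mb4"
    # Assumed collation; any utf8mb4_* collation available on the server works.
    updated["collation"] = "utf8mb4_unicode_ci"
    updated["use_unicode"] = True
    return updated


def connection_charsets(connection):
    """Return the session's character_set_* variables as a dict."""
    cursor = connection.cursor()
    try:
        cursor.execute("SHOW SESSION VARIABLES LIKE 'character_set_%'")
        return dict(cursor.fetchall())
    finally:
        cursor.close()


# Usage, assuming the sg_titles_db_settings dict from the question:
#   import mysql.connector
#   conn = mysql.connector.connect(**with_utf8mb4(sg_titles_db_settings))
#   print(connection_charsets(conn))
```

If character_set_client, character_set_connection and character_set_results all report utf8mb4 and the text is still stored as ?, the table or column definition is the next place to look.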