SQLAlchemy Returning UTF-8 as Latin1 Strings


I have a MySQL database encoded in UTF-8, but when I connect to it with SQLAlchemy (Python 2.7), the strings I get back are encoded in Latin-1.

So, the Dutch spelling of Belgium (België) comes out as

'Belgi\xeb'

rather than

'Belgi\xc3\xab'

or, ideally, the Unicode object

u'Belgi\xeb'
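
To make the difference between these three forms concrete, here is a minimal sketch using only the standard codecs. The variable names are illustrative and not part of the original question, and the snippet is written so that it runs on Python 3 (the byte reprs print slightly differently on Python 2.7).

```python
# The Unicode text "België"; \xeb is the code point U+00EB (ë).
name = u'Belgi\xeb'

# Latin-1 stores ë as the single byte 0xEB.
latin1_bytes = name.encode('latin-1')

# UTF-8 stores ë as the two bytes 0xC3 0xAB.
utf8_bytes = name.encode('utf-8')

print(repr(latin1_bytes))
print(repr(utf8_bytes))

# Decoding with the matching codec recovers the original text.
assert latin1_bytes.decode('latin-1') == name
assert utf8_bytes.decode('utf-8') == name

# Latin-1 bytes are not valid UTF-8 here: 0xEB starts a multi-byte
# sequence that is never completed, so decoding fails.
try:
    latin1_bytes.decode('utf-8')
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False
```

So a byte string ending in `\xeb` suggests that the data was encoded as Latin-1 somewhere between the server and the client.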

Solution

  • According to the docs (http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html#custom-dbapi-args):

    MySQLdb will accommodate Python unicode objects if the use_unicode=1 parameter, or the charset parameter, is passed as a connection argument.

    Without this setting, many MySQL server installations default to a latin1 encoding for client connections.

    You need to use

    create_engine('mysql+mysqldb://HOSTNAME/DATABASE?charset=utf8')
    

    rather than just

    create_engine('mysql+mysqldb://HOSTNAME/DATABASE')
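
If connection URLs are built in several places, one way to make sure the charset is always present is a small helper like the one below. This is only a sketch: `with_charset` is a hypothetical name, not part of SQLAlchemy, and it uses the Python 3 standard library (`urllib.parse`) to edit the URL string. It assumes the URL has the same shape as the ones above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_charset(url, charset='utf8'):
    """Return url with a charset query parameter added if it has none.

    Hypothetical helper for illustration; an existing charset is kept.
    """
    parts = urlsplit(url)
    # Parse the existing query string into (key, value) pairs.
    params = parse_qsl(parts.query)
    if not any(key == 'charset' for key, _ in params):
        params.append(('charset', charset))
    # Rebuild the URL with the (possibly extended) query string.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path,
         urlencode(params), parts.fragment)
    )


url = with_charset('mysql+mysqldb://HOSTNAME/DATABASE')
print(url)
```

The resulting string can then be passed to create_engine exactly as in the first form shown above.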