I am porting my code to Python 3 while maintaining backwards compatibility with Python 2.
The str function in Python 2 and Python 3 converts strings with non-ASCII characters differently. For example:
Python 2:
In [4]: str('Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve')
Out[4]: 'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. L\xc3\xb6ve & D. L\xc3\xb6ve'
But in Python 3:
In [1]: str('Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve')
Out[1]: 'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'
How can I get the same representation in Python 2? I am writing the strings to a sqlite3 table.
It appears what you want is a unicode string literal. In Python 3, all normal string literals are unicode string literals. In Python 2, only unicode values are unicode strings; a plain literal is a byte string (str), which is why the ö shows up as the two bytes \xc3\xb6. Creating a unicode string literal in Python 2 is accomplished by putting a u in front of the literal:
u'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'
This is the same representation as your Python 3 string. Note that if your source file is in UTF-8 encoding and contains non-ASCII characters, Python 2 needs a special comment declaring the encoding, on the first or second line, such as:
# -*- coding: utf-8 -*-
For more information on this, see PEP 263 or this other question.
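To tie this back to the sqlite3 use case, here is a minimal sketch that should behave the same on Python 2.7 and Python 3.3+ (the u prefix is accepted again from 3.3 on). The in-memory database and the table name species are assumptions made up for illustration, not taken from your code:

```python
# -*- coding: utf-8 -*-
import sqlite3

# The u prefix makes this a unicode string on Python 2; on Python 3.3+ it is
# accepted and has no effect, since every str literal is already unicode.
name = u'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'

# Encoding to UTF-8 yields the bytes that Python 2 displayed for the plain
# (non-u) literal: the character ö becomes the two bytes \xc3\xb6.
encoded = name.encode('utf-8')

# Hypothetical in-memory database and table, used only for this example.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE species (name TEXT)')

# Pass the unicode string as a bound parameter; sqlite3 stores it as TEXT.
conn.execute('INSERT INTO species (name) VALUES (?)', (name,))

# By default, TEXT columns come back as unicode (Python 2) or str (Python 3).
stored = conn.execute('SELECT name FROM species').fetchone()[0]
conn.close()
```

Here stored compares equal to name on either version, so the value round-trips through the table without the \xc3\xb6 byte escapes appearing.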