python unicode travis-ci nose six

Missing u-strings on Python 3.2?


I have a large suite of unit tests that runs on Travis CI, and it fails only on Python 3.2. How can I solve this without using six.u()?

def test_parse_utf8(self):
    s = String("foo", 12, encoding="utf8")
    self.assertEqual(s.parse(b"hello joh\xd4\x83n"), u"hello joh\u0503n")

======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_strings.py, line 37))
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.2.5/lib/python3.2/site-packages/nose/failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "/home/travis/virtualenv/python3.2.5/lib/python3.2/site-packages/nose/loader.py", line 414, in loadTestsFromName
    addr.filename, addr.module)
  File "/home/travis/virtualenv/python3.2.5/lib/python3.2/site-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/home/travis/virtualenv/python3.2.5/lib/python3.2/site-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/home/travis/build/construct/construct/tests/test_strings.py", line 37
    self.assertEqual(s.build(u"hello joh\u0503n"), b"hello joh\xd4\x83n")
                                               ^
SyntaxError: invalid syntax

I am trying to get this to work:

import sys
PY3 = sys.version_info[0] == 3
# Python 2 only: decode("utf-8") does not expand \uXXXX escapes in a plain str literal.
def u(s): return s if PY3 else s.decode("utf-8")

self.assertEqual(s.parse(b"hello joh\xd4\x83n"), u("hello joh\u0503n"))

Quote from the six documentation (https://pythonhosted.org/six/):

On Python 2, u() doesn’t know what the encoding of the literal is. Each byte is converted directly to the unicode codepoint of the same value. Because of this, it’s only safe to use u() with strings of ASCII data.

But the whole point of using unicode is to not be restricted to ASCII.
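The behaviour in the quote can be reproduced with the standard `unicode_escape` codec. This is only a sketch of how a u()-style helper might treat a Python 2 literal, not the six implementation itself, and the helper name `decode_py2_literal` is hypothetical. The argument stands in for the bytes that a plain "..." literal holds on Python 2.

```python
def decode_py2_literal(raw):
    """Decode the bytes of a Python 2 str literal with unicode_escape.

    Hypothetical helper: escapes such as \\uXXXX are expanded, and every
    other byte is mapped to the code point with the same value.
    """
    return raw.decode("unicode_escape")


# The literal "hello joh\u0503n" on Python 2 holds a backslash, 'u' and
# four hex digits, all ASCII, so the escape is expanded to U+0503.
escaped = decode_py2_literal(b"hello joh\\u0503n")

# Raw UTF-8 bytes are NOT decoded as UTF-8: 0xD4 and 0x83 become the two
# separate code points U+00D4 and U+0083.
raw_utf8 = decode_py2_literal(b"hello joh\xd4\x83n")
```

So the ASCII restriction applies to the bytes in the source file, not to the resulting text: a non-ASCII character written as an ASCII escape sequence still comes out as intended, while the same character written as raw bytes does not.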


Solution

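  • I think you're out of luck here. Python 3.0 through 3.2 reject the u"..." prefix outright (PEP 414 restored it in Python 3.3), which is why the module fails with a SyntaxError at import time, before any test runs.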

    Either use six.u() or drop support for Python 3.2.
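A possible third route, not part of the answer above and offered only as a sketch: build the expected text by decoding a bytes literal. The b"..." prefix is accepted by Python 2.6+ and by every Python 3 release, so nothing in the line is version-specific syntax. The variable name `expected` is just for illustration.

```python
# The UTF-8 bytes 0xD4 0x83 encode the single code point U+0503.
# On Python 2 this yields a unicode object, on Python 3 a str.
expected = b"hello joh\xd4\x83n".decode("utf-8")
```

In the test from the question, the second argument to assertEqual would then be this decoded value rather than a u"..." literal.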