python-2.7 unicode python-sphinx non-ascii-characters doctest

How can I test output of non-ASCII characters using Sphinx doctest?

I'm at a loss as to how to test printed output that includes non-ASCII characters using Sphinx doctest.

When I have tests that include code that generates non-ASCII characters, or that contain expected results that include non-ASCII characters, I get encoding errors.

For example, if I have:

def foo():
    return 'γ'

then a doctest including

>>> print(foo())

will produce an error of the form

Encoding error: 'ascii' codec can't encode character u'\u03b3' in position 0: ordinal not in range(128)

as will any test of the form

>>> print('γ')
γ

The only way I can avoid these errors is to ensure that neither the functions whose results I print nor the expected printed results contain such characters. As a result, I've had to disable many important tests.

At the head of all my code I have

# encoding: utf8
from __future__ import unicode_literals

and (in desperation) I've tried things like

doctest_global_setup = (
    '#encoding: utf8\n\n'
    'from __future__ import unicode_literals\n'
)

and

.. testsetup::

   from __future__ import unicode_literals

but these (of course) don't change the outcome.

How can I test output of non-ASCII characters using Sphinx doctest?

Solution

I believe this is caused by your from __future__ import unicode_literals statement, which turns every string literal into a unicode object. print implicitly encodes unicode strings to the terminal encoding, and when there is no terminal, Python 2 falls back to the ascii codec.
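
The encoding step that fails can be reproduced without Sphinx. The following is only a minimal sketch: the names text, ascii_ok and utf8_bytes are made up for illustration, and it does by hand what print would do implicitly when it falls back to ascii.

```python
# Runs on Python 2.7 and Python 3; the u'' prefix and the \u escape
# keep this source pure ASCII, so no coding declaration is needed.
text = u'\u03b3'  # GREEK SMALL LETTER GAMMA

# Encoding to ascii is what fails when no terminal encoding is available.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 succeeds and yields two bytes.
utf8_bytes = text.encode('utf-8')
```

Here ascii_ok ends up False, which is the same UnicodeEncodeError reported in the question, while the UTF-8 encoding goes through.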

If you skip the explicit print, it works with or without the import:

>>> def foo():
...  return 'ë'
...
>>> foo()
'\x89'

Or:

>>> from __future__ import unicode_literals
>>> def foo():
...  return 'ë'
...
>>> foo()
u'\xeb'

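Then you can test against the escaped representation of the string. Note that the byte-string form depends on the encoding of the console or source file ('\x89' is ë in cp437; under UTF-8 it would be '\xc3\xab'), whereas the unicode form u'\xeb' is the same everywhere.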
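
A variant of the same idea is to compare the result with an escaped literal instead of matching its repr, so nothing non-ASCII is ever printed. This is only a sketch using the standard doctest module directly rather than Sphinx; EXAMPLE and the test name gamma_example are hypothetical, and foo stands in for the function from the question.

```python
import doctest

def foo():
    # Stand-in for the function in the question; returns a unicode gamma.
    return u'\u03b3'

# Raw string, so the doctest source contains the literal escape \u03b3.
# The expected output is plain ASCII ("True"), so no encoding is involved.
EXAMPLE = r"""
>>> foo() == u'\u03b3'
True
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(EXAMPLE, {'foo': foo}, 'gamma_example', None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults with .failed and .attempted
```

The comparison form behaves the same on Python 2 and Python 3, whereas the repr of a unicode string differs between the two.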

You can also try changing the encoding that print uses by setting the environment variable PYTHONIOENCODING=utf8 before the Python process starts.
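
As a rough illustration of what PYTHONIOENCODING does, the sketch below starts a child interpreter with a piped (non-terminal) stdout and the variable set. The names env, output and decoded are made up for the example. It shows the interpreter-level effect only; whether Sphinx's own capture of doctest output respects the variable is an assumption you would need to check.

```python
import os
import subprocess
import sys

# Copy the current environment and force UTF-8 for the child's stdio.
env = dict(os.environ, PYTHONIOENCODING='utf8')

# stdout is a pipe here, not a terminal, which is the situation in
# which Python 2 would otherwise fall back to the ascii codec.
output = subprocess.check_output(
    [sys.executable, '-c', "print(u'\\u03b3')"],
    env=env,
)

# The child wrote UTF-8 bytes; decode them and drop the trailing newline.
decoded = output.decode('utf-8').strip()
```

With the variable set, decoded holds the gamma character instead of the child failing with a UnicodeEncodeError.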