Tags: python, django, unicode, formatting

Unable to encode/decode pprint output


This question stems from a side effect of a related question.

All of my .py files have the # -*- coding: utf-8 -*- encoding declaration on the first line, including my api.py.

As I mentioned in the related question, I use HttpResponse to return the API documentation. I set the encoding like this:

HttpResponse(cy_content, content_type='text/plain; charset=utf-8')

Everything works, and when I call my API service there are no encoding problems, except in the string that pprint builds from a dictionary.

Since some of the values in my dict contain Turkish characters, pprint converts them to escaped byte sequences, like this:

API_STATUS = {
    1: 'müşteri',
    2: 'some other status message'
}

my_str = 'Here is the documentation part that contains Turkish chars like işüğçö'
my_str += pprint.pformat(API_STATUS, indent=4, width=1)
return HttpResponse(my_str, content_type='text/plain; charset=utf-8')

My plain-text output looks like this:

Here is the documentation part that contains Turkish chars like işüğçö

{
    1: 'm\xc3\xbc\xc5\x9fteri',
    2: 'some other status message'
}

I have tried decoding and encoding the pprint output with different encodings, with no success. What is the best practice for overcoming this problem?


Solution

  • pprint appears to use repr by default, which in Python 2 escapes non-ASCII characters. You can work around this by overriding PrettyPrinter.format:

    # coding=utf8
    
    import pprint
    
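    class MyPrettyPrinter(pprint.PrettyPrinter):
        def format(self, object, context, maxlevels, level):
            # Python 2: emit unicode values as raw UTF-8 bytes instead of their repr().
            if isinstance(object, unicode):
                # format() returns a (string, is_readable, is_recursive) tuple.
                return (object.encode('utf8'), True, False)
            # Everything else falls back to the default formatting.
            return pprint.PrettyPrinter.format(self, object, context, maxlevels, level)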
    
    
    d = {'foo': u'işüğçö'}
    
    pprint.pprint(d)              # {'foo': u'i\u015f\xfc\u011f\xe7\xf6'}
    MyPrettyPrinter().pprint(d)   # {'foo': işüğçö}
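
The repr behaviour behind the escaped output can be seen in isolation. The following is a minimal sketch, assuming Python 3 rather than the Python 2 used in the answer; the names text_output, as_bytes, and bytes_output are illustrative only. It suggests that the escapes shown in the question come from calling repr on UTF-8 encoded byte strings, while text strings are left readable:

```python
# Python 3 sketch (an assumption: the answer above targets Python 2).
import pprint

API_STATUS = {1: 'müşteri', 2: 'some other status message'}

# Text strings: repr() in Python 3 keeps printable non-ASCII characters.
text_output = pprint.pformat(API_STATUS)

# UTF-8 byte strings: repr() escapes every non-ASCII byte,
# which reproduces the escapes seen in the question's output.
as_bytes = {key: value.encode('utf-8') for key, value in API_STATUS.items()}
bytes_output = pprint.pformat(as_bytes)

print(bytes_output)  # contains b'm\xc3\xbc\xc5\x9fteri'
```

On Python 3 with text strings, the default pprint output should therefore already be readable, so the format override is mainly of interest on Python 2.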