python python-3.x decode encode python-3.3

Python decode and encode with utf-8


I am trying to encode and decode with utf-8. What is weird is that I get an error traceback saying that I am using gbk.

    oneword.decode("utf-8")

Below is the error traceback:

UnicodeEncodeError: 'gbk' codec can't encode character '\u2769' in position 1: illegal multibyte sequence

Can anyone tell me what to do? It seems that the decode parameter has no effect.


Solution

  • I got it solved. I was actually writing the output to a file rather than to the console. In that situation, the encoding of the output file has to be specified explicitly; otherwise the platform default encoding (gbk, judging by the traceback) is used. Instead of using open, I used codecs.open.

    import codecs
    
    # filename is the path of the output file; the encoding is given explicitly
    f = codecs.open(filename, mode='w', encoding='utf-8')
    

    Thanks to @Bakuriu from the comments:

    If you are using Python 3 you no longer need to import the codecs module. Just pass the encoding parameter to the built-in open function.
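
A minimal sketch of the Python 3 approach from the comment above, assuming the text contains the character from the traceback. The sample string and the temporary output path are hypothetical and exist only for this illustration; the first part shows why a gbk target fails, and the second writes the same text with the built-in open and an explicit encoding.

```python
import os
import tempfile

# Sample text (an assumption for this sketch) with '\u2769' at position 1,
# the character reported in the traceback.
text = "a\u2769b"

# Encoding to gbk fails for this character; this is what happens implicitly
# when the output target defaults to gbk.
try:
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

# Built-in open with an explicit encoding; no codecs import is needed.
with open(path, mode="w", encoding="utf-8") as f:
    f.write(text)

# Read the file back with the same encoding to check the round trip.
with open(path, mode="r", encoding="utf-8") as f:
    round_trip = f.read()
```

Using a with block also closes the file automatically, which the codecs.open line above leaves to the caller.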