Tags: python, json, excel, character-encoding, iso-8859-1

'utf8' codec can't decode byte 0xf3


I am using Python 2.7 to read a JSON file. My code is:

import json
from json import JSONDecoder
import os

path = os.path.dirname(os.path.abspath(__file__))+'/json'
print path

for root, dirs, files in os.walk(os.path.dirname(path+'/json')):
    for f in files:  
        if f.lower().endswith((".json")):
            fp=open(root + '/'+f)
            data = fp.read()
            print data.decode('utf-8')

But I got the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 72: invalid continuation byte

Solution

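  • Your file is not encoded in UTF-8. In Python 2.7, fp.read() returns raw bytes, so the error is actually raised by data.decode('utf-8'). Byte 0xf3 is "ó" in Latin-1 (ISO-8859-1), so the file is most likely in that encoding. Open it with an explicit encoding instead: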

    import io
    io.open(filename, encoding='latin-1')
    

    Also, the portable, platform-independent way to join path components is:

    os.path.join(root, f)
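
Putting both suggestions together, the loop from the question could look roughly like the sketch below. It assumes the files really are Latin-1, which the 0xf3 byte suggests but does not prove. The function name read_json_texts and its encoding parameter are made up for illustration and are not part of the original code.

```python
from __future__ import print_function  # keeps print() usable on Python 2.7

import io
import os


def read_json_texts(json_dir, encoding='latin-1'):
    """Yield (path, text) for every .json file found under json_dir.

    read_json_texts is a hypothetical helper; 'latin-1' is an assumed
    default based on the 0xf3 byte in the error message.
    """
    for root, dirs, files in os.walk(json_dir):
        for name in files:
            if name.lower().endswith('.json'):
                # os.path.join picks the right separator for the platform.
                full_path = os.path.join(root, name)
                # io.open decodes while reading, so fp.read() returns
                # unicode text instead of raw bytes.
                with io.open(full_path, encoding=encoding) as fp:
                    yield full_path, fp.read()
```

Each yielded text can then be passed to json.loads(text). Note that Latin-1 maps every possible byte to a character, so decoding never fails. If the files are in some other encoding, the result will silently contain wrong characters, and a different value should be passed as encoding.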