csv python-unicode python-3.6

python 3 read csv UnicodeDecodeError


I have a very simple bit of code that reads a CSV file into a 2D array. It runs fine on Python 2, but on Python 3 I get the error below. Looking through the documentation, I think I need to use .decode(). Could someone please explain how to use it in the context of my code, and why I don't need to do anything in Python 2?

Error:

  line 21, in <module>
    for row in datareader:
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 5002: invalid start byte

import csv 
import sys 

fullTable = sys.argv[1]

datareader = csv.reader(open(fullTable, 'r'), delimiter=',') 
full_table = [] 
for row in datareader:
        full_table.append(row)

print(full_table)

Solution

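  • open(sys.argv[1], encoding='ISO-8859-1')

    The CSV contained bytes that are not valid UTF-8, which appears to be the default encoding Python 3 used for open() here. I am, however, surprised that Python 2 dealt with this without any problems; presumably that is because Python 2's open() returns undecoded byte strings, so no decoding step ever runs.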
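
    For reference, here is a minimal sketch of the question's script with the encoding passed explicitly. The helper name read_csv_table is made up for illustration, and ISO-8859-1 is only an assumption about how the file was actually encoded:

```python
import csv


def read_csv_table(path, encoding="ISO-8859-1"):
    """Read a CSV file into a list of rows (hypothetical helper name)."""
    # The encoding is an assumption about the file; pass another one if needed.
    # newline="" lets the csv module handle line endings itself,
    # as the csv documentation recommends.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return list(csv.reader(handle, delimiter=","))
```

    Calling read_csv_table(sys.argv[1]) would then stand in for the open/loop part of the original script. Note that ISO-8859-1 maps every possible byte to a character, so it never raises UnicodeDecodeError; if the file was really written in some other encoding (Windows-1252, for example), some characters may silently come out wrong.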