Using Python 3.5.2 on Windows (32-bit), I'm reading a DBF file with dbfread, which returns each record as an OrderedDict.
from dbfread import DBF
Table = DBF('FME.DBF')
for record in Table:
    print(record)
The first records are read without problems, but iteration fails as soon as it reaches a record that contains diacritics:
Traceback (most recent call last):
  File "getdbe.py", line 3, in <module>
    for record in Table:
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\dbf.py", line 311, in _iter_records
    for field in self.fields]
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\dbf.py", line 311, in <listcomp>
    for field in self.fields]
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\field_parser.py", line 75, in parse
    return func(field, data)
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\field_parser.py", line 83, in parseC
    return decode_text(data.rstrip(b'\0 '), self.encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 11: ordinal not in range(128)
The problem occurs even if I don't print the record.
Any ideas?
dbfread failed to detect the correct encoding from your DBF file. From the Character Encodings section of the documentation:
dbfread will try to detect the character encoding (code page) used in the file by looking at the language_driver byte. If this fails it reverts to ASCII. You can override this by passing encoding='my-encoding'.
Emphasis mine.
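As a minimal sketch of that override: the helper name read_records is hypothetical, and cp850 is only an assumed candidate (0x82 happens to be é in that codepage); the real codepage of FME.DBF is not known from the question. The demo lines at the bottom show why the ASCII fallback fails on the byte from the traceback.

```python
def read_records(path, encoding):
    """Yield records from a DBF file, decoding text fields with `encoding`."""
    # Imported lazily so the byte-level demo below runs without dbfread installed.
    from dbfread import DBF
    for record in DBF(path, encoding=encoding):
        yield record


# The byte from the traceback is outside the ASCII range,
# so the ASCII fallback cannot decode it.
raw = b'\x82'
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Assumption: cp850 is just one plausible candidate, not a confirmed answer.
decoded = raw.decode('cp850')
```

With the file from the question, this would be used as: for record in read_records('FME.DBF', 'cp850'): print(record).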
You'll have to pass in an explicit encoding; this will invariably be a Windows codepage. Take a look at the supported codecs in Python; you'll have to use one that starts with cp here. If you don't know which codepage to use, you'll have some trial-and-error work to do. Note that some codepages overlap in characters, so even if a codepage appears to produce legible results, you may want to keep searching and try different records in your data file to see what fits best.
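The trial-and-error step could look something like the sketch below. The candidate list, the helper name try_codecs and the sample bytes are all illustrative assumptions, not taken from your file. To get real bytes to test, dbfread can return undecoded field values if you open the file with raw=True.

```python
# Assumed shortlist of codepages; extend it with whatever fits your data's origin.
CANDIDATES = ['cp437', 'cp850', 'cp1250', 'cp1252']


def try_codecs(raw, codecs=CANDIDATES):
    """Decode `raw` with each codec; map codec name to text, or None on failure."""
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical field value containing the byte from the traceback.
sample = b'caf\x82'
for name, text in sorted(try_codecs(sample).items()):
    print(name, repr(text))
```

Several codecs will decode the same bytes without raising an error, so compare the output against words you expect to see in the data rather than stopping at the first codec that succeeds.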