python, python-2.7, decoding, dbf

DBF Import Charmap Error Python


I have another question open about an import, but I ran into a different issue. I'm trying to import data from a DBF file, and while most DBF files work, I ran into one that gives me the following error:

"C:\Program Files\Anaconda2\python.exe" D:/Projects/DBFImport/DBFImporter/extractdbf.py
Traceback (most recent call last):
  File "D:/Projects/DBFImport/DBFImporter/extractdbf.py", line 17, in <module>
    for record in table.records:
  File "C:\Program Files\Anaconda2\lib\site-packages\dbfread\dbf.py", line 316, in _iter_records
    for field in self.fields]
  File "C:\Program Files\Anaconda2\lib\site-packages\dbfread\field_parser.py", line 79, in parse
    return func(field, data)
  File "C:\Program Files\Anaconda2\lib\site-packages\dbfread\field_parser.py", line 157, in parseM
    return self.decode_text(memo)
  File "C:\Program Files\Anaconda2\lib\site-packages\dbfread\field_parser.py", line 45, in decode_text
    return decode_text(text, self.encoding, errors=self.char_decode_errors)
  File "C:\Program Files\Anaconda2\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 278: character maps to <undefined>

Here's the code, kept very simple for easy analysis:

import pyodbc, os, string
from dbfread import DBF

# SQL Server Connection Test
cnxn = pyodbc.connect('DRIVER={SQLServer};SERVER=***********;DATABASE=TEST_DBFIMPORT;UID=test;PWD=test')
cursor = cnxn.cursor()

table = DBF('E:\\Backups\\imp.dbf', lowernames=True)
for record in table.records:
    rec1 = record['id']
    cursor.execute ("insert into imp(ID) values(?)", rec1)
cnxn.commit()

I've tried all kinds of decoding but nothing seems to work.

Update1:

<type 'tuple'>: (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeError('charmap', 'Firearms as appraised on May 18, 2011. F.I.E (Firearms Import Export Co.) .26 automatic pistol S/N # AS21212 ----------------- $175.00 Walther (Smith & Wesson) P22, 22LR semi automatic pistol S/N # N052010 -------------- $325.00 Taurus .357 Magnum Model 608 revolver, blue,, 4\xe2\x80\x9d vent rib barrel S/N # LF632765 ------------ $375.00 Colt MKII Series 70 semi automatic pistol, 9mm, blue, pacmeyer grips, S/N # 70S49671 -------- $475.00 Ruger Model 10/22 semi-automatic carbine, 22LR, S/N # 126-90774 ----- $200.00', 278, 279, 'character maps to <undefined>'), None)
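
The repr in the update hints at the cause: the bytes `\xe2\x80\x9d` after the `4` are the UTF-8 encoding of a right double quotation mark (U+201D), and 0x9d is a byte that cp1252 leaves undefined. A minimal sketch using only the standard library and a shortened sample of that memo text (the variable names are illustrative, not from the original script):

```python
from __future__ import print_function

# Shortened sample of the memo bytes shown in the update above.
sample = b'blue,, 4\xe2\x80\x9d vent rib barrel'

try:
    sample.decode('cp1252')
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the first byte cp1252 cannot map

print('cp1252 fails at index:', failed_at)

# The same bytes decode cleanly when treated as UTF-8.
as_utf8 = sample.decode('utf-8')
print(repr(as_utf8))
```

If this holds for the whole file, the memo text was probably written as UTF-8 even though the table is being read as cp1252.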

Solution

  • You are getting the error because a few byte values in cp1252 (five in Python's codec: 0x81, 0x8D, 0x8F, 0x90 and 0x9D) have no Unicode mapping; they're just blank.

    Using my dbf library you would normally open the file as:

    table = dbf.Table('e:/Backups/imp.dbf')  # forward slash and backslash both work
    

    You can see the file encoding specified by the table itself by printing the table:

    print table
    

    To override the encoding specified in the table itself:

    table = dbf.Table('e:/Backups/imp.dbf', codepage='...')
    

    If nothing else works you can try using 'utf8' for the code page. It's not part of the dbf spec but may help (I added it for my own use, so nothing guaranteed/warrantied etc.).
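
For the dbfread code in the question, a comparable override is the `encoding` argument of `DBF`, and `char_decode_errors` (visible in the traceback) controls what happens to bytes that still fail to decode. The following is a sketch rather than a definitive fix: it assumes the memo text really is UTF-8, and `open_table` and `decode_with_fallback` are hypothetical helpers, not part of either library.

```python
from __future__ import print_function


def open_table(path, encoding='utf-8', errors='strict'):
    """Open a DBF with dbfread, overriding the encoding declared in the file."""
    # Third-party import kept local so the helper below runs without dbfread.
    from dbfread import DBF
    return DBF(path, lowernames=True, encoding=encoding,
               char_decode_errors=errors)


def decode_with_fallback(raw, encodings=('utf-8', 'cp1252')):
    """Try each encoding in order; as a last resort, replace undecodable bytes."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    return raw.decode(encodings[-1], 'replace')


# UTF-8 bytes decode on the first attempt.
print(repr(decode_with_fallback(b'4\xe2\x80\x9d barrel')))
# Bytes that are not valid UTF-8 fall through to cp1252.
print(repr(decode_with_fallback(b'caf\xe9')))
```

With the question's file this would be called as `open_table('E:\\Backups\\imp.dbf')`. Passing `errors='replace'` trades the exception for replacement characters, which hides the problem rather than fixing it, so it is best kept as a last resort. The fallback helper is only useful if you read the fields as raw bytes yourself, for example when a file mixes encodings.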