Python 3 dbf module: trouble appending Cyrillic letters


I'm trying to create a dbf file. Everything works normally, but when I try to append Cyrillic letters I get:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

I know Python has had problems with Unicode, but is there some way to put Cyrillic letters into a dbf?

code:

from datetime import datetime

import dbf

table = dbf.Table(ex_file_name)
table.open(mode=dbf.READ_WRITE)
for r in rows_massive:
    table.append(
        (datetime.strptime(r[0], '%d.%m.%Y'), r[1], r[2], PLACEPAY, prefix_name))

PLACEPAY contains Cyrillic letters at positions 0-6.


Solution

  • The problem is that the dbf was not created with a code page, so it defaulted to ASCII. Try creating the table with code page 866 (Russian). If you create it with the dbf module [1], it looks like this:

    table = dbf.Table('filename.dbf', 'field1 D; field2 C(10); ...', codepage='cp866')
    

    If you cannot create the dbf yourself, but whatever other software you are using is broken enough to read non-ASCII data in an ASCII-specified dbf file, then you can simply override the code page whenever you open the table in Python (it's the same as above, but without the field specifications):

    table = dbf.Table('filename.dbf', codepage='cp866')
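To see why the code page matters, here is a minimal sketch that uses only Python's built-in codecs, not the dbf module itself. The string `placepay` is a hypothetical stand-in for the PLACEPAY value in the question; the real value is not shown there.

```python
# Hypothetical seven-letter Cyrillic value standing in for PLACEPAY.
placepay = "Магазин"

# Encoding with ASCII fails, which is what happens when the table
# has no code page and falls back to ASCII.
try:
    placepay.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as err:
    ascii_error = err

# cp866 has a single byte for every Cyrillic letter in the string,
# so encoding succeeds and the text survives a round trip.
cp866_bytes = placepay.encode("cp866")
round_trip = cp866_bytes.decode("cp866")

print(ascii_error)
print(len(cp866_bytes), round_trip == placepay)
```

With a seven-letter value the reported error range is positions 0-6, the same shape as the traceback in the question.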
    

    Alternatively, if you are only using Python, and only the dbf module, you can try the undocumented code page 'utf8', which is incompatible with other dbf libraries. If you do, make your character fields bigger, since some Unicode code points need more than one byte. The worst case is four bytes per code point, so the safe route is to make your character fields four times larger; e.g. a C(6) field would become C(24).
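The field-size arithmetic for the 'utf8' option can be sketched with plain Python. The helper name `utf8_field_width` and the sample strings are illustrative assumptions, not part of the dbf module.

```python
def utf8_field_width(values, worst_case=False):
    """Return a character-field width in bytes for the given strings.

    With worst_case=True, assume four bytes per code point (the safe
    route); otherwise measure the actual UTF-8 length of each value.
    """
    if worst_case:
        return 4 * max(len(v) for v in values)
    return max(len(v.encode("utf-8")) for v in values)


# Hypothetical sample data: two six-letter Cyrillic words and one ASCII word.
samples = ["Привет", "Hello", "Москва"]

measured = utf8_field_width(samples)
safe = utf8_field_width(samples, worst_case=True)
print(measured, safe)
```

Cyrillic letters take two bytes each in UTF-8, so the measured width for this data is half the four-times worst case; the worst-case figure is the one that stays safe if other scripts or symbols show up later.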


    [1] Disclosure: I am the author of the dbf module.