I need to write a script that connects an ERP program to a production program. The production side is straightforward: I send it data via HTTP requests. The ERP side is harder, because its data must be read from a DBF file.
I use the dbf library because (if I'm not mistaken) it's the only one that lets me filter data in a fairly simple and fast way. I open the database this way:
table = dbf.Table(path).open()
dbf_index = dbf.pql(table, "select * where ident == 'M'")
I then loop through each record that the query returned. I need to package the selected data from the DBF database into JSON and send it to the production program's API.
data = {
    "warehouse_id": parseDbfData(record['SYMBOL']),
    "code": parseDbfData(record['SYMBOL']),
    "name": parseDbfData(record['NAZWA']),
    "main_warehouse": False,
    "blocked": False
}
The parseDbfData function looks like this. It is not the cause of the problem, because the same error occurred without it; I only added it while trying to fix the problem.
def parseDbfData(data):
    return str(data.strip())
When run, if the script encounters any non-ASCII character in the DBF data (e.g. Polish characters such as ą, ę, ś, ć), it terminates with an error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x88 in position 15: ordinal not in range(128)
The error points to this line (where the JSON is built):
"name" : parseDbfData(record['NAZWA']),
The value the script is trying to read at this point is probably "Magazyn materiałów Podgórna". As you can see, this value contains the characters "ł" and "ó". I think this is what breaks the script, but I don't know how to fix it.
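To illustrate what is probably happening, here is a minimal sketch that uses only the standard library and does not touch the dbf module. It assumes the file stores its text in the CP852 code page, which is an assumption at this point. Under that assumption, the "ł" in the suspected value sits at position 15 and is stored as byte 0x88, which matches the position and byte in the error message.

```python
# The suspected field value, encoded the way a CP852 DBF file would store it.
# CP852 is an assumption here, not something read from the actual file.
raw = "Magazyn materiałów Podgórna".encode("cp852")

# The byte at position 15 is where "ł" sits in the string.
print(hex(raw[15]))

# Decoding the same bytes as ASCII fails, because ASCII only covers 0x00-0x7F.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = str(exc)
    print(ascii_error)

# Decoding with the matching code page gives the original text back.
decoded = raw.decode("cp852")
print(decoded)
```

So the problem is not Python 3 strings as such, but the codec used to turn the file's raw bytes into text.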
I'll mention that I'm using Python 3.9. I know there were character encoding issues in the 2.x versions, but I thought Python 3.x had remedied that problem. I found out it hadn't :(
I came to the conclusion that I have to specify the encoding directly when reading the DBF database. However, I could not work out from the documentation exactly how to do this.
After a thorough analysis of the dbf module itself, I found that I need to use the codepage parameter when opening the database. After some trial and error, I determined that of all the encodings available in the module, cp852 suits my data best.
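The trial and error can also be done outside the dbf module. The sketch below decodes the same raw bytes with several candidate code pages so the results can be compared by eye. The candidate list and the helper name try_codepages are my own picks for illustration, and SAMPLE is a stand-in for bytes copied from a real field; the dbf module may support a different set of code pages.

```python
# Stand-in for raw bytes taken from a text field in the DBF file (assumed CP852).
SAMPLE = "Magazyn materiałów Podgórna".encode("cp852")

# Hypothetical shortlist of code pages commonly seen with Polish data.
CANDIDATES = ["cp852", "cp1250", "iso8859_2", "cp437"]


def try_codepages(raw, candidates):
    """Decode raw bytes with each candidate codec; None marks a failed decode."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


for name, text in try_codepages(SAMPLE, CANDIDATES).items():
    print(f"{name:10} {text!r}")
```

A candidate that decodes without an error is not necessarily the right one; the one to pick is the one whose output shows the expected Polish letters.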
After the correction, the code to open a DBF database looks like this:
table = dbf.Table(path, codepage='cp852').open()
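Once the fields come back as properly decoded strings, building the JSON body is straightforward. Below is a rough sketch of that step, with a plain dict standing in for a dbf record so it runs without the DBF file. The record values, the helper names and the choice of ensure_ascii=False are my assumptions, not something the production program's API is known to require.

```python
import json


def parse_dbf_data(value):
    """Convert a field value to str and trim the padding DBF fields carry."""
    return str(value).strip()


def build_payload(record):
    """Map one warehouse record to the structure sent to the production API."""
    return {
        "warehouse_id": parse_dbf_data(record["SYMBOL"]),
        "code": parse_dbf_data(record["SYMBOL"]),
        "name": parse_dbf_data(record["NAZWA"]),
        "main_warehouse": False,
        "blocked": False,
    }


def to_request_body(payload):
    """Serialize to JSON and encode as UTF-8, keeping Polish letters unescaped."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


# Made-up stand-in for a record read from the table (values are illustrative).
record = {"SYMBOL": "MAG1      ", "NAZWA": "Magazyn materiałów Podgórna   "}

body = to_request_body(build_payload(record))
print(body.decode("utf-8"))
```

With the default ensure_ascii=True, json.dumps would instead write the Polish letters as \uXXXX escapes, which is also valid JSON; either form should survive the HTTP request as long as the body is sent as UTF-8.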