I have some Farsi characters that I want to write to a dbf file using my custom codepage, which uses 1 byte per character. I think this problem can be solved in one of two ways:
1- Passing my custom codepage to the dbf table.
2- Writing binary data directly to the dbf file, bypassing the default codepage of the dbf package (which is utf8).
How can I solve this problem with either of these approaches?
Here is the code:
import dbf
man = 'مرد'
woman = 'زن'
row1 = (man, woman)
row2 = (man, woman)
with open('./file.dbf', 'w') as f:
    table = dbf.Table(filename='./file.dbf',
                      field_specs='field1 C(3); field2 C(3)', codepage='customCodePage', on_disk=True)
    table.open(dbf.READ_WRITE)
    table.append(row1)
    table.append(row2)
    table.close()
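For reference, here is a minimal sketch of what registering a 1-byte custom codec with Python's own `codecs` module could look like. The codec name `custom_farsi` and the mapping tables are hypothetical, and the only non-ASCII entry filled in is byte 148 for پ; the rest of the table would have to be completed. Whether the dbf package would accept such a name for its `codepage` argument is a separate question that this sketch does not answer.

```python
import codecs

# Hypothetical, partial tables: ASCII is passed through unchanged and the
# only non-ASCII entry is byte 148 <-> 'پ' (U+067E). Everything else in the
# real custom codepage would still need to be added.
DECODING = {i: chr(i) for i in range(128)}
DECODING[148] = '\u067e'
ENCODING = {ord(ch): byte for byte, ch in DECODING.items()}


def _encode(text, errors='strict'):
    # Returns (bytes, number_of_characters_consumed)
    return codecs.charmap_encode(text, errors, ENCODING)


def _decode(data, errors='strict'):
    # Returns (str, number_of_bytes_consumed)
    return codecs.charmap_decode(data, errors, DECODING)


def _search(name):
    # Python passes the requested encoding name here in lowercase.
    if name == 'custom_farsi':
        return codecs.CodecInfo(name='custom_farsi', encode=_encode, decode=_decode)
    return None


codecs.register(_search)
```

With this in place, `'پ'.encode('custom_farsi')` produces the single byte 148, and decoding that byte gives back پ. Characters missing from the table raise `UnicodeEncodeError` under the default `strict` error handling.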
After trying to register my codec, I ended up translating each character from utf8 to its code in my "Custom Farsi codec", and then to the windows-1256 character that has the same decimal code. When the user later reads the data with the custom codec, the bytes written for those windows-1256 characters map to the right characters in the custom codec. Of course, the characters in this raw form are not meaningful.
For example, the letter پ has the decimal code point 1662 in Unicode and the code 148 in the custom codec. The character at code 148 in windows-1256 is ”, so پ translates to ” by way of 3 different dictionaries. I did this for all characters on the Farsi keyboard.
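A minimal sketch of that translation, collapsed into a single lookup, might look like the following. The names `CUSTOM_BYTE_FOR` and `to_cp1256_standin` are made up for illustration, and only the entry for پ is filled in; the full table for the Farsi keyboard is assumed to exist elsewhere.

```python
# Hypothetical, partial table: Unicode character -> byte value in the custom codec.
# Only the پ entry (148) is filled in here.
CUSTOM_BYTE_FOR = {
    '\u067e': 148,  # پ
}


def to_cp1256_standin(text):
    """Replace each character with the windows-1256 character whose byte
    value equals the character's code in the custom codec."""
    out = []
    for ch in text:
        byte_value = CUSTOM_BYTE_FOR[ch]  # KeyError for characters not in the table
        out.append(bytes([byte_value]).decode('cp1256'))
    return ''.join(out)


standin = to_cp1256_standin('\u067e')
# Encoding the stand-in text as windows-1256 yields the custom codec's bytes.
raw = standin.encode('cp1256')
```

Here `standin` is the ” character and `raw` is the single byte 148, which is what a reader using the custom codec would interpret as پ.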