I want to write a Java program that reads a DBF file containing German letters like 'ö'. The problem is that I don't know which encoding the file uses. When I open the file in Notepad++ or the Windows editor, the encoding is reported as ANSI, but both programs show the 'ö' as '”'. Excel, however, displays the 'ö' correctly.
I also tried changing the encoding in Notepad++, but nothing worked. Does anyone know a way to see which encoding Excel is currently using, or which encoding the file uses?
Strings in your .DBF file are encoded as cp850 (although any of ['cp1026', 'cp437', 'cp775', 'cp850', 'cp852', 'cp857', 'cp858', 'cp861', 'cp865', 'cp895'] could apply, and it is hard to tell which from a single isolated example).
Explanation:
This is a case of mojibake: the byte that cp850 uses for 'ö' is being displayed as if it were cp1252 (the example is in Python because it is widely readable):
'ö'.encode('cp850').decode('cp1252')
'”'
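The list of candidate code pages can be reproduced with a brute-force search. The following is a minimal sketch, assuming the byte on disk is 0x94 (which is what cp1252 renders as '”'); the helper name `candidate_codecs` is made up for this illustration, and the exact result depends on which codecs your Python version ships.

```python
import encodings.aliases


def candidate_codecs(char, observed_byte):
    """Return sorted codec names that encode `char` as exactly `observed_byte`."""
    matches = set()
    # encodings.aliases.aliases maps alias -> codec module name;
    # its values give the distinct codecs bundled with this Python.
    for codec in set(encodings.aliases.aliases.values()):
        try:
            if char.encode(codec) == observed_byte:
                matches.add(codec)
        except (UnicodeError, LookupError):
            # The codec cannot represent the character, is not a text
            # encoding, or is unavailable on this platform.
            continue
    return sorted(matches)


# 'ö' shown as '”' under cp1252 means the stored byte is 0x94.
candidates = candidate_codecs('ö', b'\x94')
print(candidates)
```

Every codec this returns is equally consistent with the single 'ö' sample, which is why more sample characters (for example 'ß' or 'ü') are needed to narrow the choice.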
BTW, opening a .dbf file in a text editor makes no sense because it is a binary file (see the .dbf header structure). Hence, any algorithm that guesses the text encoding (like the one in Notepad++) must fail…
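Since the file is binary, the header itself is a better place to look than a text editor. In the common dBASE/FoxPro layout, byte 29 of the 32-byte file header holds a "language driver ID" that hints at the code page. The sketch below rests on several assumptions: the function name `declared_codec` is made up for this illustration, the mapping is deliberately partial, and many files leave this byte at 0, in which case you are back to guessing.

```python
# Partial, illustrative mapping of language driver IDs to Python codec names.
# Real files may use IDs that are not listed here.
LANGUAGE_DRIVERS = {
    0x01: 'cp437',   # US MS-DOS
    0x02: 'cp850',   # International MS-DOS
    0x03: 'cp1252',  # Windows ANSI
    0x64: 'cp852',   # Eastern European MS-DOS
    0xC8: 'cp1250',  # Eastern European Windows
    0xC9: 'cp1251',  # Russian Windows
}


def declared_codec(header):
    """Return the codec hinted at by a DBF header, or None if unknown/unset."""
    if len(header) < 32:
        raise ValueError('a DBF file header is at least 32 bytes long')
    # Byte 29 (zero-based) is the language driver ID in this layout.
    return LANGUAGE_DRIVERS.get(header[29])
```

To use it, read the first 32 bytes of the file in binary mode (`open(path, 'rb').read(32)`, where `path` is your .dbf file) and pass them in. A result of `None` means the header gives no usable hint.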
Further reading (DBase/FoxPro never escaped the limitations of 8-bit encodings):