I have a folder with a bunch of dbf files that I would like to convert to csv. I tried a script that simply renames the extension from .dbf to .csv. The renamed files open fine in Excel, but when I read them with pandas they look like this:
s\t�
0 NaN
1 1 176 1.58400000000e+005-3.385...
This is not what I want, and those characters don't appear in the real file.
How should I read in the dbf file correctly?
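Renaming only changes the file name. The bytes inside are still DBF, a binary format with a fixed-size header followed by fixed-width records, which is most likely why pandas' CSV parser shows garbage. As a rough illustration (a minimal sketch assuming the classic dBASE III-style layout; `read_dbf_header` is a hypothetical helper, not part of any library), the header can be inspected with the standard library alone:

```python
import struct

def read_dbf_header(path):
    """Return basic header info from a dBASE III-style .dbf file (sketch only)."""
    with open(path, "rb") as f:
        header = f.read(32)
        # Bytes 0-11: version, last update (YY, MM, DD), record count,
        # header length, record length. All little-endian.
        version, yy, mm, dd, n_records, header_len, record_len = struct.unpack(
            "<4BIHH", header[:12]
        )
        fields = []
        while True:
            desc = f.read(32)  # one 32-byte descriptor per field
            if len(desc) < 32 or desc[0] == 0x0D:  # 0x0D terminates the list
                break
            name = desc[:11].split(b"\x00", 1)[0].decode("ascii", "replace")
            # (name, type code, field length, decimal count)
            fields.append((name, chr(desc[11]), desc[16], desc[17]))
    return {
        "version": version,
        "n_records": n_records,
        "header_len": header_len,
        "record_len": record_len,
        "fields": fields,
    }

# Hypothetical usage:
# info = read_dbf_header('fake_file_name.dbf')
# print(info["n_records"], info["fields"])
```

This only reads the metadata. For the actual records, a dedicated reader is the safer choice.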
Looking online, there are a few options:
With simpledbf:
from simpledbf import Dbf5

dbf = Dbf5('fake_file_name.dbf')
df = dbf.to_dataframe()
Tweaked from the gist:
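import pandas as pd
import pysal as ps  # pysal 1.x API; newer releases expose this as libpysal.io.open

def dbf2DF(dbfile, upper=True):
    """Read a dbf file and return a pandas DataFrame."""
    with ps.open(dbfile) as db:  # I suspect just using open will work too
        # Build one column per field named in the dbf header
        df = pd.DataFrame({col: db.by_col(col) for col in db.header})
        if upper:
            # Upper-case the column names
            df.columns = [col.upper() for col in df.columns]
        return df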
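
Since the goal is to convert a whole folder, either reader can be wrapped in a small loop. The following is a sketch under assumptions: `convert_folder` and its parameters are hypothetical names, and `read_dbf` stands for any function that takes a path and returns an object with a `to_csv` method, such as a pandas DataFrame.

```python
from pathlib import Path

def convert_folder(folder, read_dbf, pattern="*.dbf"):
    """Write a .csv next to every file in `folder` that matches `pattern`.

    `read_dbf` is any callable that maps a file path to a DataFrame-like
    object exposing `to_csv(path, index=False)`.
    """
    written = []
    for dbf_path in sorted(Path(folder).glob(pattern)):
        csv_path = dbf_path.with_suffix(".csv")  # same name, new extension
        df = read_dbf(str(dbf_path))
        df.to_csv(csv_path, index=False)
        written.append(csv_path)
    return written

# Hypothetical usage with the readers shown above:
# convert_folder('my_folder', lambda p: Dbf5(p).to_dataframe())
# convert_folder('my_folder', dbf2DF)
```

Note that the glob pattern is case-sensitive on most Linux systems, so files named `.DBF` would need a second pass with `pattern="*.DBF"`.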