
Python: Fast querying in a big dbf (xbase) file


I have a big DBF file (~700 MB) and I'd like to select only a few records from it using a Python script. I've seen that dbfpy is a nice module for opening this type of database, but so far I haven't found any querying capability in it. Iterating through all the records from Python is simply too slow.

Can I do what I want from Python in a reasonable time?


Solution

  • Using my dbf module you can create temporary indexes and then search using those:

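    import dbf

    table = dbf.Table('big.dbf')
    table.open()  # recent dbf releases need the table opened before records can be read

    # build a temporary, in-memory index; replace `field` with the actual field name
    index = table.create_index(lambda rec: rec.field)

    # the match value is a tuple with one entry per indexed field
    records = index.search(match=('value',))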
    

    Creating the index may take a few seconds, but the searches after that are extremely quick.
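
To show why the one-time indexing cost pays off, here is a minimal sketch of the general idea in plain Python. It is not the dbf module's implementation: `build_index`, `search`, and the sample rows are hypothetical names made up for illustration. The keys are sorted once, and each later lookup is a binary search instead of a full scan.

```python
import bisect

def build_index(records, key):
    """Sort (key, position) pairs once; this is the slow, one-time step."""
    return sorted((key(rec), pos) for pos, rec in enumerate(records))

def search(index, records, value):
    """Return every record whose key equals `value`, located by binary search."""
    # (value,) sorts before any (value, pos) pair, so this finds the first match
    i = bisect.bisect_left(index, (value,))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(records[index[i][1]])
        i += 1
    return matches

# hypothetical sample data standing in for the rows of a DBF table
rows = [
    {"id": 1, "city": "Lyon"},
    {"id": 2, "city": "Paris"},
    {"id": 3, "city": "Lyon"},
    {"id": 4, "city": "Nice"},
]

city_index = build_index(rows, key=lambda rec: rec["city"])
lyon_rows = search(city_index, rows, "Lyon")
```

Building the index touches every record once, which is why it can take a few seconds on a large file. Each search afterwards only inspects about log2(n) index entries plus the matching records themselves.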