python, performance, bibtex

Is there an efficient way to load a large BibTeX file (37000 entries) in Python?


In my Python application, I load about 37'000 BibTeX entries.

The following code reads the .txt file and parses it as BibTeX, but loading the contents takes a long time before any further processing can start. Is there a way to do it more efficiently?

import bibtexparser

# Read the whole file into memory, then parse it in one call
with open('/home/usr/Downloads/bibtexFile.txt') as bibtex_file:
    bibtex_str = bibtex_file.read()

bib_database = bibtexparser.loads(bibtex_str)

Solution

  • Try this using biblib==0.1.3; note that the snippet itself imports its parser from pybtex. The file stats.bib contains uniquely formatted BibTeX entries.

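    from pybtex.database.input import bibtex

    # Parse the whole .bib file into a BibliographyData object
    parser = bibtex.Parser()
    bib_data = parser.parse_file('stats.bib')
    # entries maps each citation key to its Entry
    print(bib_data.entries)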
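
To check whether one loader really is faster than the other on your own file, it may help to time both. The following is only a minimal sketch using the standard library; `time_loader` and the stand-in `count_entries` function are hypothetical names introduced for illustration and are not part of bibtexparser or pybtex.

```python
import time


def time_loader(loader, *args, repeats=3):
    """Call loader(*args) `repeats` times; return (last_result, best_seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = loader(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return result, best


if __name__ == "__main__":
    # Stand-in loader (hypothetical) so the sketch runs without any BibTeX
    # library installed; swap in a real parser call for an actual measurement.
    def count_entries(text):
        return text.count("@article")

    sample = "@article{key1, title={A}}\n@article{key2, title={B}}\n"
    n, seconds = time_loader(count_entries, sample)
    print(f"{n} entries in {seconds:.6f} s")
```

For a real comparison you could pass `bibtexparser.loads` together with `bibtex_str`, or a small function that creates a fresh `bibtex.Parser()` and calls `parse_file('stats.bib')`, so that each repeat starts from an empty parser. With 37'000 entries, `repeats=1` is probably enough.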