python excel csv xlrd

Error: Unsupported format, or corrupt file: Expected BOF record


I am trying to open an xlsx file and just print its contents, but I keep running into an error. This is my code:

import xlrd
book = xlrd.open_workbook("file.xlsx")
print "The number of worksheets is", book.nsheets
print "Worksheet name(s):", book.sheet_names()
print

sh = book.sheet_by_index(0)

print sh.name, sh.nrows, sh.ncols
print

print "Cell D30 is", sh.cell_value(rowx=29, colx=3)
print

for rx in range(5):
    print sh.row(rx)
    print

It prints out this error:

raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found    '\xff\xfeT\x00i\x00m\x00'

Thanks


Solution

  • The error message relates to the BOF (Beginning of File) record of an XLS file. However, the example shows that you are trying to read an XLSX file.

    There are 2 possible reasons for this:

    1. Your version of xlrd is old and doesn't support reading xlsx files.
    2. The XLSX file is encrypted and thus stored in the OLE Compound Document format, rather than a zip format, making it appear to xlrd as an older format XLS file.

    Double check that you are in fact using a recent version of xlrd. Opening a new XLSX file with data in just one cell should verify that.
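
As a rough aid for that version check, here is a minimal sketch. The helper name `xlrd_reads_xlsx` is hypothetical, and the version bounds are an assumption based on xlrd's release history: xlsx support arrived in 0.8.0 and was removed again in 2.0.0, so "recent" has an upper limit as well.

```python
def xlrd_reads_xlsx(version):
    """Return True if an xlrd version string is in the range that reads .xlsx.

    Assumed bounds (not taken from the question): 0.8.0 <= version < 2.0.0.
    """
    # Compare only the major and minor components, e.g. "0.9.2" -> (0, 9).
    major, minor = (int(part) for part in version.split(".")[:2])
    return (0, 8) <= (major, minor) < (2, 0)
```

With xlrd installed, the installed version string should be available as `xlrd.__VERSION__`, so the call would look like `xlrd_reads_xlsx(xlrd.__VERSION__)`.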

    However, I would guess that you are encountering the second condition and that the file is encrypted, since you state above that you are already using xlrd version 0.9.2.

    XLSX files are encrypted if you explicitly apply a workbook password, but also if you password-protect some of the worksheet elements. As such, it is possible to have an encrypted XLSX file even if you don't need a password to open it.
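
One way to tell the zip and OLE containers apart without xlrd is to look at the first few bytes of the file. The sketch below is illustrative only: the function names `sniff_format` and `sniff_file` are made up for this example, and the signatures it checks are the standard zip local-file header, the OLE Compound Document header, and the UTF-16 byte order marks.

```python
# Leading bytes ("magic numbers") of the containers discussed above.
ZIP_MAGIC = b"PK\x03\x04"                         # ordinary, unencrypted .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # .xls, or an encrypted .xlsx
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")           # UTF-16 encoded text, not a workbook


def sniff_format(head):
    """Classify a file from its first bytes (pass at least 8 bytes)."""
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(UTF16_BOMS):
        return "utf16-text"
    return "unknown"


def sniff_file(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

For example, `sniff_file("file.xlsx")` returning "ole2" would be consistent with an encrypted workbook. The bytes quoted in the error message above begin with `\xff\xfe`, the UTF-16 little-endian byte order mark, so this sketch would label them "utf16-text".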

    Update: See @BStew's answer for a third, more probable cause: the file is open in Excel.