When I read an .xls file, xlrd acts as if there was an empty column at position 0, and the last column (number 58, if counting from 0) is never read; the first column is skipped and the last one disappears.
The issue goes away when I save the file (either as an .xls or .xlsx) and do the whole process again. When saving, Excel tells me that there may be compatibility issues. I need to automate this process, which is why I'd like to avoid manually having to save the file.
# Loading the 'converted' file
wb = xlrd.open_workbook(converted_file_path)
sheet = wb.sheet_by_index(0)
print('Row',len(sheet.row(3)),sheet.row(3))
# Prints: Row 59 [text:'Client Number', ...
# Loading the original file
wb = xlrd.open_workbook(original_file_path)
print('Row',len(sheet.row(3)),sheet.row(3))
# Prints: Row 58 [empty:'', text:'Client Number', ...
print(sheet.cell_value(0,5))
# Prints nothing, while there is a number at cell (0,5)
Thanks to Martin for suggesting the win32com
fix, but I found the actual problem. It has something to do with how xlrd
has Python load up sheets in memory. Perhaps messing around with the on_demand
parameter while loading the workbook does something, but anyhow, here's what was happening:
I needed the sheet named "Report X" where X is some number, so I looked for it like this:
for sheet in wb.sheets():
if 'Report' in sheet.name:
report_sheet = sheet
break
Then, I changed it to
for sheet in wb.sheets():
if 'Report' in sheet.name:
report_sheet = sheet
# removed the break
And now it works.