I am trying to change a number in an Excel file (and eventually in multiple Excel files, by putting the code in a loop). I want to edit the file and save it as a new file, which I have done successfully. The problem is that the new file loses all of the formatting the original had. Correct me if I'm wrong, but I can't use openpyxl because it only works with .xlsx files, and all of the files I'm working with are .xls.
I've looked into pandas but was unsuccessful in finding a solution. I'm most familiar with xlrd and xlwt, but am willing to try any other library if it solves the problem.
import xlrd
from xlutils.copy import copy

# Open the workbook and read the manufacturer cell
loc = r"X:\Projects\test.xls"
wb = xlrd.open_workbook(loc)
dose = wb.sheet_by_index(13)
manu = dose.cell_value(4, 3)

# Write 675 into a writable copy and save it under a new name
w = copy(wb)
if manu == "Hologic":
    w.get_sheet(13).write(5, 3, 675)
w.save('book2.xls')
Again, the code works without any errors, but the new .xls file has no formatting. The formatting is crucial for this project, so I can't lose any of it.
You probably can't.
Microsoft created xlsx files for a reason: the classic xls format is a legacy binary format that piles up hundreds, maybe thousands, of features, each represented in a different way (and the file format was not even openly documented back then; I don't know if it is now). So there is one app that can open an xls file and guarantee to present what is there with all the features intended by the file's creator: Excel. And the same Excel version that created the file, at that.
So any open library that can write xls will tend to create the most basic files, with little or no formatting, and will be lucky if it can parse out the content.
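One thing that may be worth testing before giving up on the pure-Python route: xlrd's `open_workbook` accepts `formatting_info=True`, and `xlutils.copy` then carries the formatting records that xlrd managed to parse into the writable copy. The sketch below assumes xlrd, xlwt and xlutils are installed; the helper name `copy_xls_with_edit` is made up for illustration. Features these libraries do not model (charts, images, conditional formatting and the like) may still be lost, so check the output against the original.

```python
def copy_xls_with_edit(src, dst, sheet_index, row, col, value):
    """Copy an .xls workbook, change one cell, and save it under a new name.

    Hypothetical helper, not part of any library. Row and column are 0-based,
    as in xlrd and xlwt.
    """
    # Imported lazily so this module still loads where the packages are missing.
    import xlrd
    from xlutils.copy import copy

    # formatting_info=True makes xlrd read fonts, borders, number formats, etc.
    # Without it, xlutils.copy has no formatting to carry over.
    book = xlrd.open_workbook(src, formatting_info=True)
    writable = copy(book)
    writable.get_sheet(sheet_index).write(row, col, value)
    writable.save(dst)
```

For the file in the question, the call would be `copy_xls_with_edit(r"X:\Projects\test.xls", "book2.xls", 13, 5, 3, 675)`. As far as I know, the cell that is written gets xlwt's default style unless a style is passed to `write()`, so the edited cell itself may need its format reapplied.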
xlsx files, on the other hand, are built from conforming XML files internally, so even a program that does not know the full spec can change information in the file and preserve formatting and everything else, simply by not touching anything it does not understand and assembling valid XML again.
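To make the "don't touch what you don't understand" idea concrete: an xlsx file is a zip archive of XML parts, so a program can rewrite one part and copy every other part through unchanged. The sketch below uses only the standard library; `replace_zip_member` is a made-up helper, and the archive it edits is a toy stand-in, not a complete or valid workbook.

```python
import io
import zipfile


def replace_zip_member(src_bytes, member, new_data):
    """Return a new zip archive in which `member` is replaced and the rest is copied."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            # Only the named part is rewritten; everything else passes through.
            data = new_data if info.filename == member else zin.read(info.filename)
            zout.writestr(info, data)
    return out.getvalue()


# Toy archive standing in for an .xlsx (illustrative content only).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("xl/styles.xml", "<styleSheet/>")
    z.writestr("xl/worksheets/sheet1.xml", "<c r='D6'><v>1</v></c>")

edited = replace_zip_member(
    buf.getvalue(), "xl/worksheets/sheet1.xml", "<c r='D6'><v>675</v></c>"
)
```

The styles part comes out byte-for-byte identical, which is why formatting survives this kind of edit. A real edit would still have to parse the sheet XML properly instead of swapping in a hand-written string.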
That said, if you can't convert to xlsx, maybe the easiest thing to do is to use Python to drive Excel itself and make the changes for you in an automated way.
Documentation for that is sparse, but it is possible using pywin32 and the COM API. Take a look here for a start: https://pbpython.com/windows-com.html
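A minimal sketch of what driving Excel through COM could look like, assuming Windows, an installed copy of Excel, and the pywin32 package. The helper name `set_cell_with_excel` and its `app` parameter are made up for illustration (`app` only exists so a stand-in object can be passed in for testing); indices are 1-based here, as in Excel's object model.

```python
import os


def set_cell_with_excel(src, dst, sheet_number, row, col, value, app=None):
    """Open src in Excel, set one cell, and save the result as dst in .xls format."""
    if app is None:
        # Requires Windows, an installed Excel, and the pywin32 package.
        import win32com.client
        app = win32com.client.Dispatch("Excel.Application")
    app.DisplayAlerts = False  # suppress overwrite and compatibility prompts
    try:
        # Absolute paths avoid surprises with Excel's own working directory.
        wb = app.Workbooks.Open(os.path.abspath(src))
        try:
            wb.Worksheets(sheet_number).Cells(row, col).Value = value
            wb.SaveAs(os.path.abspath(dst), FileFormat=56)  # 56 = xlExcel8 (.xls)
        finally:
            wb.Close(SaveChanges=False)
    finally:
        app.Quit()
```

Since the question's code uses 0-based indices, sheet 13 and cell (5, 3) there correspond to `set_cell_with_excel(r"X:\Projects\test.xls", "book2.xls", 14, 6, 4, 675)` here. Because Excel itself does the reading and writing, the formatting should come through intact.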
Another option is LibreOffice: it can read and write xls files with formatting (though surely with some losses), and it is scriptable in Python. Unfortunately, information on how to script LibreOffice from Python for this is also hard to find, and the legacy "UNO" bridge it uses to interact with Python makes it complicated to use.
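Short of full UNO scripting, LibreOffice can also be run headless from the command line to convert a file, which may be enough to get from xls to xlsx and then continue with an xlsx library. A minimal sketch, assuming the `soffice` executable is on the PATH (on some systems it has a different name or location); both helper names are made up for illustration, and how much formatting survives the conversion is something to verify on your own files.

```python
import subprocess


def build_convert_command(src, outdir, target="xlsx", soffice="soffice"):
    """Build the LibreOffice command line that converts src into outdir."""
    return [soffice, "--headless", "--convert-to", target, "--outdir", outdir, src]


def convert_with_libreoffice(src, outdir, target="xlsx", soffice="soffice"):
    """Run the conversion; raises CalledProcessError if LibreOffice reports failure."""
    subprocess.run(build_convert_command(src, outdir, target, soffice), check=True)
```

Calling `convert_with_libreoffice("test.xls", "converted")` would be expected to write `converted/test.xlsx`.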