Search code examples
pythonexcelpandasopenpyxlxls

Outputting A .xls File In Python


I have been teaching myself Python to automate some of our work processes. So far reading from Excel files (.xls, .xlsx) has gone great.

Currently I have hit a bit of a snag. Although I can output .xlsx files fine, the software system that we have to use for our primary work task can only take .xls files as an input - it cannot handle .xlsx files, and the vendor sees no reason to add .xlsx support at any point in the foreseeable future.

When I try to output a .xls file using either Pandas or OpenPyXl, and open that file in Excel, I get a warning that the file format and extension of the file do not match, which leads me to think that attempting to open this file using our software could lead to some pretty unexpected consequences (because it's actually a .xlsx file, just not named as such)

I've tried to search for how to fix this all on Google, but all I can find are guides for how to convert a .xls file to a .xlsx file (which is almost the opposite of what I need). So I was wondering if anybody could please help me on whether this can be achieved, and if it can, how.

Thank you very much for your time


Solution

  • Under the pandas.DataFrame.to_excel documentation you should notice a parameter called engine, which states:

    engine : str, optional

    Write engine to use, openpyxl or xlsxwriter. You can also set this via the options io.excel.xlsx.writer, io.excel.xls.writer, and io.excel.xlsm.writer.

    What it does not state is that the engine param is automatically picked based on your file extension -- therefore, easy fix:

    import pandas as pd
    
    df = pd.DataFrame({"data": [1, 2, 3]})
    df.to_excel("file.xls") # Notice desired file extension.
    

    This will automatically use the xlwt engine, so make sure you have it installed via pip install xlwt.