Search code examples
pandasopenpyxlpandas.excelwriter

pandas exlewriter.book does not read my excel file and even break the existed file


I want to stack a series of dataframe in one excel file and I wrote the code below.

if os.path.isfile(result) is False:
    with pd.ExcelWriter(result, engine='openpyxl') as writer:
        raw_data.to_excel(writer,sheet_name=sheet1, index=False)      
else:
    with pd.ExcelWriter(result, engine='openpyxl') as writer:
        writer.book=openpyxl.load_workbook(result)
        raw_data.to_excel(writer, sheet_name=sheet2, index=False)

The initial creation of the excel file works fine, but after the file had been created, the error occurred at the step writer.book=openpyxl.load_workbook(result) like below.

zipfile.BadZipFile: File is not a zip file

During handling of the above exception, another exception occurred:

IndexError: At least one sheet must be visible

And, after this error, the size of existed file had changed and file did not open.

Before:

xlsx file before

After:

xlsx file after

Is there any idea to solve this problem?


There are some ambiguities in my previous code, and the code below might be much clear

import pandas as pd
import os
import openpyxl as oxl

resultpath=r'D:/CODE/result.xlsx'

for i in range(5):
    sheet = 'Sheet'+ str(i)
    result = pd.DataFrame({'a': range(10), 'b': range(10)})
    if os.path.isfile(resultpath) is False:
        with pd.ExcelWriter(resultpath, engine='openpyxl') as writer:
            result.to_excel(writer,sheet_name=sheet, index=False)
             
    else:
        with pd.ExcelWriter(resultpath, engine='openpyxl') as writer:
            writer.book =oxl.load_workbook(resultpath)
            result.to_excel(writer, sheet_name=sheet, index=False)

the version of each package is presented below.

Pandas : 1.4.2 openpyxl : 3.0.9


Solution

  • I found a solution to this problem

    Using the 'mode' property of ExcelWriter solves this problem and also makes code easy to read.

    The modified code is below and it works fine

    import pandas as pd
    import os
    
    result = pd.DataFrame({'a': range(10), 'b': range(10)})
    
    resultpath=r'D:/CODE/test.xlsx'
    
    for i in range(5):
        sheet = 'Sheet'+ str(i)
        if not os.path.exists(resultpath):   
            with pd.ExcelWriter(resultpath, mode='w', engine='openpyxl') as writer:
                result.to_excel(writer, index=False, sheet_name=sheet)
        else:   
            with pd.ExcelWriter(resultpath, mode='a', engine='openpyxl') as writer:
                result.to_excel(writer, index=False, sheet_name=sheet)