Search code examples
pythonperformanceopenpyxlxlsm

How to improve openpyxl writing values in cells?


I have a template .xlsm file with macros and a basic structure. I need to write inside the file some user object that I have in an array. Since there are many columns that aren't used (or don't have a value corresponding in the user class), I need to write only specific cells for each row.

I came out with a basic loop like this:

wb2 = openpyxl.load_workbook(XLSM_TEMPLATE_PATH, keep_vba=True)
ws2 = wb2['Datas']
row_num = 4

i = 0
for user in users:
    current_row = row_num + i
    ws2.cell(row=current_row, column=21).value = user.last_name
    ws2.cell(row=current_row, column=22).value = user.first_name
    ws2.cell(row=current_row, column=25).value = user.tax_code
    ws2.cell(row=current_row,
             column=30).value = user.residence_address.street_address + ', ' + user.residence_address.street_number
    ws2.cell(row=current_row, column=31).value = user.residence_address.city_name
    ws2.cell(row=current_row, column=34).value = user.email
    ws2.cell(row=current_row, column=38).value = user.date_of_birth
    ws2.cell(row=current_row, column=39).value = user.place_of_birth

    i += 1

exported_file_path = EXPORT_PATH.format(generate_random_code('.xlsm'))
wb2.save(exported_file_path)
wb2.close()

The problem is that looping for 3-4 users takes about 15 second of processing time, so the Amazon lambda function (that hosts the script) times out and fails... and also increasing the work time isn't a big deal since will increase the cost too much.

There is any solution to speed up the process?


Solution

  • After different researches the only thing that can be done to improve the performance is to change the library with a more optimized one. I obtained better results with Pandas. It's still not "fast" but I got around -20% in computation time.