I am trying to load a large .xlsx
file using OpenPyXL
, while loading a 80 MB .xlsx file my entire 8 GB of memory is getting full
Excel file has 4 sheets with 800 000 rows.
from openpyxl import load_workbook
wb = load_workbook('Record.xlsx')
def process(ws):
'''
Read all rows of a worksheet
'''
data = []
for a, c, f, k in zip(ws['A'], ws['C'], ws['F'], ws['K']):
data.append([a, c, f, k])
return data
ws1 = wb.worksheets[0] # Sheet 1
data1 = process(ws1)
ws2 = wb.worksheets[1] # Sheet 2
data2 = process(ws2)
ws3 = wb.worksheets[2] # Sheet 3
data3 = process(ws3)
ws4 = wb.worksheets[3] # Sheet 4
data4 = process(ws4)
Why while loading 80 MB of excel file 8 GB of memory is not enough?
Try using the read_only = True
argument to use read only mode, as described here.