javascript, node.js, webassembly, pyodide

Pyodide unable to read xlsx files into pandas


I'm trying to load xlsx files using Pyodide. I am able to load xls files correctly using the following code:

response = await fetch('${fileUrl}')
js_buffer = await response.arrayBuffer()
dFrame = pd.read_excel(BytesIO(js_buffer.to_py()))

However, it fails when I try to pass an xlsx file, and I am not sure what's causing it. Here are the potential issues I have already ruled out after testing:

  1. "openpyxl" not properly loaded - I used micropip to load openpyxl, and pd.read_excel('filename.xlsx') works when I manually place a valid xlsx file in Pyodide's in-memory file system. So this isn't the issue.
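
Another thing that may be worth ruling out is whether the fetched bytes are really a workbook at all, since a failed request can return an HTML error page instead. The sketch below is only a rough check, assuming the buffer has already been converted to Python `bytes`; `sniff_container` is a hypothetical helper name, not part of pandas or Pyodide. It relies on xlsx files being ZIP archives and legacy xls files being OLE2 compound documents.

```python
import io
import zipfile

# Leading bytes of the two container formats.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP archive (used by xlsx)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 file (used by legacy xls)


def sniff_container(data: bytes) -> str:
    """Return 'zip', 'ole2', or 'unknown' based on the file signature."""
    if data.startswith(ZIP_MAGIC) and zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


# Build a tiny ZIP in memory to stand in for a fetched xlsx payload.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")

print(sniff_container(buf.getvalue()))                       # zip
print(sniff_container(b"<!DOCTYPE html><html>404</html>"))   # unknown
```

A result of `zip` only says the payload is a ZIP archive, not that it is a valid workbook. In the snippet above, the check would presumably be run on something like `bytes(js_buffer.to_py())`.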

Thanks for the help.


Solution

  • I'm still not sure why calling read_excel or ExcelFile directly doesn't work (especially since they just call openpyxl anyway), but I was able to get it working by opening the file with openpyxl and then converting it to a DataFrame. Working code below:

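          from io import BytesIO
          import openpyxl
          import pandas as pd
          from js import fetch  # Pyodide's bridge to the browser's fetch

          response = await fetch('${fileUrl}')
          js_buffer = await response.arrayBuffer()
          # data_only=True returns cached cell values instead of formulas
          wb = openpyxl.load_workbook(BytesIO(js_buffer.to_py()), data_only=True)
          ws = wb.worksheets[0]
          excel_data = ws.values  # generator of row tuples
          columns = next(excel_data)  # first row holds the column names
          dFrame = pd.DataFrame(excel_data, columns=columns)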
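
The header handling in this workaround depends on `ws.values` being a generator of row tuples: `next()` consumes the first row as the column names, and only the remaining rows are passed on as data. Here is a minimal sketch of that step using only the standard library. The `fake_values` generator is a hypothetical stand-in for `ws.values`, `split_header` is an illustrative helper name, and the sample rows are made up.

```python
from typing import Iterable, Iterator, List, Tuple


def fake_values() -> Iterator[tuple]:
    """Stand-in for openpyxl's ws.values: yields one tuple per sheet row."""
    yield ("name", "qty")   # header row (sample data, not from the original post)
    yield ("apple", 3)
    yield ("pear", 5)


def split_header(rows: Iterable[tuple]) -> Tuple[tuple, List[tuple]]:
    """Consume the first row as the header and collect the rest as data."""
    it = iter(rows)
    header = next(it)        # advances the iterator past the header row
    return header, list(it)  # only the remaining rows are left


columns, data = split_header(fake_values())
print(columns)  # ('name', 'qty')
print(data)     # [('apple', 3), ('pear', 5)]
```

With pandas available, `pd.DataFrame(data, columns=columns)` would then build the frame as in the code above. Note that an empty sheet would make `next()` raise `StopIteration`, so a guard may be needed for that case.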