I have a storage account (adls gen2) with several excel files, and I can print the list of them in synapse notebooks like this:
mssparkutils.fs.ls("abfss://container@acct.dfs.core.windows.net/excelfolder/")
And I can clearly see my file path for the .xlsx listed, However when I try to open a sheet it cannot find the file:
from openpyxl import load_workbook
workbook = load_workbook("abfss://container@acct.dfs.core.windows.net/excelfolder/excelfile.xlsx")
sheet = workbook.worksheets[0]
sheet
FileNotFoundError: [Errno 2] No such file or directory: 'abfss://container@acct.dfs.core.windows.net/excelfolder/excelfile.xlsx'
I have tried to use the https link and link with a sas token and it's still not able to open the file/sheet1.
How do I open the sheet or is there a better way to do this from a synapse perspective? I need to dynamically (without sas tokens) open multiple excel files and union sheets together.
The suggested duplicate has NOTHING to do with azure, synapse nor does it answer my question.
When I tried to replicate the issue in my environment, I got the same error:
Openyxl workbook expects only local file path that's the reason for the above error. I have given file path format as mentioned below:
df=pd.read_excel('https://<storageAccountName>.blob.core.windows.net/<containerName>/<directory>/samp.xlsx<sasToken>',engine="openpyxl",sheet_name = '<sheetName>')
I was able to solve the error using below code:
import pandas as pd
df=pd.read_excel('https://<storageAccountName>.blob.core.windows.net/<containerName>/<directory>/samp.xlsx<sasToken>',engine="openpyxl",sheet_name = '<sheetName>')
print(df)
Otherwise you can Mount the file path and read it in synapse notebook.