I am in search of a file-sharing solution within the Azure ecosystem of tools/services.
The current need is to write thousands of files (3-4 thousand per week) from a script that runs in Databricks to a storage solution that a few other non-technical users can access. The script that generates the reports is a Python script, not PySpark, although it does run in Databricks (a number of PySpark jobs precede it). The storage solution must allow for:
1) writing/saving Excel and HTML files from Python
2) users to view and download multiple files at a time (I believe this knocks out Blob storage?)
Thanks!
Thank you for sharing your question.
Azure does offer a dedicated data-sharing service: Azure Data Share lets you keep the store your Python script writes to separate from the store your non-technical users read from.
For point 1, I do not see any issues. Azure's storage solutions are largely file-type agnostic, so it is technically possible to write to any of them; the main difference is how much setup and effort the write path requires. A common pattern is sketched below.
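For illustration, here is a minimal sketch of writing both formats from plain Python on a Databricks driver, assuming an ADLS Gen2 or Blob container is mounted at /mnt/reports (the mount name, folder layout, and file names are placeholders):

```python
# A minimal sketch, assuming a container is mounted at /mnt/reports.
import pandas as pd

df = pd.DataFrame({"region": ["EMEA", "APAC"], "revenue": [1200, 950]})

# pandas needs an Excel engine such as openpyxl installed on the cluster
df.to_excel("/dbfs/mnt/reports/weekly/revenue.xlsx", index=False)

# to_html returns a string, so a plain file write works too
with open("/dbfs/mnt/reports/weekly/revenue.html", "w") as f:
    f.write(df.to_html(index=False))
```

The /dbfs prefix exposes DBFS as a local filesystem path on the driver, so ordinary Python file APIs work without any Spark involvement.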
For point 2, I think what you are hinting at is how easily your non-technical users can access the storage. It is possible to download multiple files at a time from Blob storage, but the Azure Portal may not be the most user-friendly way to do it. I recommend looking into Azure Storage Explorer: it is a single client application your users can use to browse, manage, and download files across all the Azure Storage services.
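If anyone on your side is comfortable running a script, the same bulk download can also be done programmatically. Here is a minimal sketch with the azure-storage-blob SDK; the account URL, container name, prefix, and credential are all placeholders:

```python
# A minimal sketch of a bulk download with the azure-storage-blob SDK.
import os
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://<account>.blob.core.windows.net",
    container_name="reports",
    credential="<sas-token-or-key>",
)

# Download every blob under a given prefix to a local folder
for blob in container.list_blobs(name_starts_with="weekly/"):
    local_path = os.path.join("downloads", blob.name)
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    with open(local_path, "wb") as f:
        container.download_blob(blob.name).readinto(f)
```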
Since you specifically mentioned HTML files and viewing multiple files at a time, I suspect you want the files to render in a browser. Every blob has a URL. If a self-contained HTML file is publicly accessible in Blob storage or ADLS Gen2 and you navigate to its URL, the browser will render the page, provided the blob was uploaded with its content type set to text/html (otherwise the browser will download it instead).
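Here is a minimal sketch of an upload that sets the content type so the page renders rather than downloads, again with the azure-storage-blob SDK and placeholder names:

```python
# A minimal sketch; account URL, container, blob name, and credential
# are placeholders.
from azure.storage.blob import BlobClient, ContentSettings

blob = BlobClient(
    account_url="https://<account>.blob.core.windows.net",
    container_name="reports",
    blob_name="weekly/revenue.html",
    credential="<sas-token-or-key>",
)

with open("/dbfs/mnt/reports/weekly/revenue.html", "rb") as f:
    # Without content_type="text/html" the browser downloads the file
    # instead of rendering it.
    blob.upload_blob(
        f,
        overwrite=True,
        content_settings=ContentSettings(content_type="text/html"),
    )
```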