Tags: python, azure, databricks, azure-databricks

Read a .ipynb file in Azure Databricks Repos


I'm trying to read a .ipynb file stored in Azure Databricks Repos, but I get the following error:

[Screenshot: error 95]

Interestingly, other file types like xlsx seem to work just fine:

[Screenshot: reading an xlsx file works]

I've already tried several approaches to resolve it, including:

  • Using both the .ipynb extension in the file path and trying without it.
  • Changing the file permissions using chmod.
  • Copying the file to the tmp/ folder (when the file is in dbfs, I tried copying it to tmp/).
  • Attempting to use other libraries such as io and pathlib.
  • Copying the file from repos to dbfs.

Despite these efforts, I still cannot perform the action shown in the image below:

[Screenshot: Desired Action]

My suspicion is that Azure Databricks might be denying the read operation for security reasons.

Has anyone encountered a similar issue, or does anyone have ideas on how to resolve it?


Solution

  • Notebooks aren't exposed through the Workspace File System (WSFS); this is a design decision. To obtain a notebook's source code, use the Databricks Python SDK (recommended) or the Export Workspace object REST API (more cumbersome).

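    import base64
    
    import databricks.sdk
    from databricks.sdk.service.workspace import ExportFormat
    
    # Create a client using the SDK's default authentication.
    w = databricks.sdk.WorkspaceClient()
    # Export the notebook in Jupyter (.ipynb) format.
    notebook = w.workspace.export("/Repos/..../notebook",
                                  format=ExportFormat.JUPYTER)
    # The exported content is base64-encoded; decode it to get the .ipynb JSON text.
    ipynb = base64.b64decode(notebook.content).decode("utf-8")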
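
Once the export has produced the `ipynb` string, it is ordinary Jupyter notebook JSON and can be parsed with the standard library. The following is a minimal sketch, not part of the original answer: the helper name `code_cells` and the `sample_ipynb` stand-in are hypothetical, and it assumes the notebook follows the usual nbformat v4 layout (a top-level `cells` list whose `source` is either a string or a list of lines).

```python
import json


def code_cells(ipynb_text: str) -> list[str]:
    """Return the source of every code cell in a Jupyter notebook JSON string."""
    nb = json.loads(ipynb_text)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # nbformat allows "source" to be a single string or a list of lines.
        sources.append(src if isinstance(src, str) else "".join(src))
    return sources


# Hypothetical stand-in for the `ipynb` string returned by the workspace export.
sample_ipynb = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["x = 1\n", "print(x)"],
         "outputs": [], "execution_count": None},
    ],
})

for source in code_cells(sample_ipynb):
    print(source)
```

In a real notebook you would pass the exported `ipynb` string instead of `sample_ipynb`. If you only need the file on disk, writing `ipynb` to a path of your choice with `open(..., "w")` should also work.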