Tags: azure, authentication, azure-databricks, microsoft-fabric

Error while writing data to MS OneLake from Azure Databricks


Currently it is not possible to create a OneLake shortcut to a private-endpoint-enabled Azure storage account, so I am exploring another way to get the data from the secured data lake into OneLake. According to the MS documentation, it is possible to integrate Azure Databricks with OneLake, so I am reading the data from the private-endpoint-enabled ADLS and trying to write it to OneLake. While writing to OneLake, I am facing the error below:

Operation failed: "Forbidden", 403, HEAD, https://onelake.dfs.fabric.microsoft.com/d36-47c2-83e3-676eec7d9/22f90e-f0d9-4559008060a/Files/RAW

I have used the code below:

df_zipped = spark.read.format("parquet").option("compression", "gzip").option("header", True).load("/mnt/defined/measuremts-2019.gz.parquet")

oneLake = 'https://onelake.dfs.fabric.microsoft.com/d36-47c2-83e3-676eec7d9/22f90e-f0d9-4559008060a/Files/RAW'

df_zipped.write.format("csv").option("header", "true").mode("overwrite").csv(oneLake)

I understand that this is an authorization issue, so could someone please help me understand how I can authenticate to OneLake from Azure Databricks?

I am accessing Fabric from https://app.fabric.microsoft.com/ using Azure AD authentication, so I am not sure how to authenticate my identity from Azure Databricks to OneLake.

Please advise.

Thanks


Solution

  • I believe you created the cluster with the Enable credential passthrough for user-level data access option enabled under Advanced options. (If not, a service-principal alternative is sketched at the end of this answer.)


    Next, you need to use the ABFS path instead of the https URL. To get the ABFS path, follow the steps below.

    Go to your workspace.


    Then select your lakehouse.


    There, open the menu option on Files and select Properties.


    You will see the paths for the Files folder, including the ABFS path.


    From here, copy the ABFS path. Below is the data I am writing.

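    The original screenshot showed the sample data being written. As a hedged stand-in (the rows and column names below are made up purely for illustration), it could be built like this:

    df = spark.createDataFrame(
        [(1, "alpha"), (2, "beta"), (3, "gamma")],  # hypothetical sample rows
        ["id", "name"],                             # hypothetical column names
    )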

    I then write it to OneLake successfully:

    oneLake = "abfss://Your_workspace@msit-onelake.dfs.fabric.microsoft.com/Your_lake_house/Files/"
    df.write.format("csv").option("header", "true").mode("overwrite").csv(oneLake)
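
    Note that the workspace name and host above come from my environment (the msit-onelake.dfs.fabric.microsoft.com host); for a regular tenant the host is onelake.dfs.fabric.microsoft.com, matching the URL in your question. A hedged sketch of the general shape, using hypothetical placeholder names (per the public OneLake URI documentation, items addressed by name usually carry the item-type suffix, e.g. .Lakehouse):

    # All angle-bracket values are placeholders for your own workspace and lakehouse names.
    onelake_abfs = "abfss://<workspace_name>@onelake.dfs.fabric.microsoft.com/<lakehouse_name>.Lakehouse/Files/RAW"
    df_zipped.write.option("header", "true").mode("overwrite").csv(onelake_abfs)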
    

    Output: the written CSV files are visible under the Files folder of the lakehouse.

    And reading it back from OneLake:

    display(spark.read.csv(oneLake,header=True))
    

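    If you do not want to rely on credential passthrough, a common alternative (not part of the steps above; a minimal sketch assuming a service principal that has been granted access to the Fabric workspace) is to authenticate with the standard ABFS OAuth Spark settings:

    # Hedged sketch: assumes an Azure AD app registration (service principal) that has been
    # added to the Fabric workspace with sufficient (e.g. Contributor) permissions.
    # All angle-bracket values and the secret scope/key names are hypothetical placeholders.
    host = "onelake.dfs.fabric.microsoft.com"
    spark.conf.set(f"fs.azure.account.auth.type.{host}", "OAuth")
    spark.conf.set(f"fs.azure.account.oauth.provider.type.{host}",
                   "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set(f"fs.azure.account.oauth2.client.id.{host}", "<application-client-id>")
    spark.conf.set(f"fs.azure.account.oauth2.client.secret.{host}",
                   dbutils.secrets.get(scope="<secret-scope>", key="<client-secret-key>"))
    spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{host}",
                   "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

    After these are set, the same abfss:// path can be used for both the write and the read shown above.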