Currently it is not possible to create a OneLake shortcut to a private-endpoint-enabled Azure storage account, so I am exploring another way to get the data from the secured data lake into OneLake. According to the MS documentation, it is possible to integrate Azure Databricks with OneLake, so I am reading the data from the private-endpoint-enabled ADLS and trying to write it to OneLake. While writing to OneLake, I am getting the error below:
Operation failed: "Forbidden", 403, HEAD, https://onelake.dfs.fabric.microsoft.com/d36-47c2-83e3-676eec7d9/22f90e-f0d9-4559008060a/Files/RAW
I have used the code below:
# Read the gzip-compressed Parquet file from the mounted ADLS path.
# (Compression is read from the Parquet metadata, and "header" does not apply to Parquet.)
df_zipped = spark.read.parquet("/mnt/defined/measuremts-2019.gz.parquet")

# Write to OneLake as CSV -- this is the call that fails with 403
oneLake = 'https://onelake.dfs.fabric.microsoft.com/d36-47c2-83e3-676eec7d9/22f90e-f0d9-4559008060a/Files/RAW'
df_zipped.write.option("header", "true").mode("overwrite").csv(oneLake)
I understand that this is an authorization issue, so could anyone please help me understand how I can authenticate to OneLake from Azure Databricks?
I am accessing Fabric at https://app.fabric.microsoft.com/ using Azure AD authentication, so I am not sure how to authenticate my identity from Azure Databricks to OneLake.
Please advise.
Thanks
I believe you created the cluster with the Enable credential passthrough for user-level data access option enabled under Advanced options.
Next, you need to use the ABFS path instead of the https URL. To get the ABFS path, follow the steps below.
1. Go to your workspace.
2. Select your lakehouse.
3. Click the menu option on Files and select Properties; the available paths will be shown.
4. Copy the ABFS path; it will look like the template below.
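For reference, the copied path generally follows this pattern (a sketch; the workspace and lakehouse segments are placeholders, and depending on how you copy it the lakehouse segment may appear as a name with a .Lakehouse suffix or as a GUID):

oneLake = "abfss://<Workspace_Name>@onelake.dfs.fabric.microsoft.com/<Lakehouse_Name>.Lakehouse/Files/"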
Below is the data I am writing, and it writes to OneLake successfully.
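The data itself was shown as a screenshot in the original post; as a stand-in, a small sample DataFrame (placeholder column names and values) is enough to demonstrate the write:

# Placeholder sample data standing in for the original screenshot
df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "value"])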
oneLake = "abfss://Your_workspace@msit-onelake.dfs.fabric.microsoft.com/Your_lake_house/Files/"
df.write.format("csv").option("header", "true").mode("overwrite").csv(oneLake)
Then I read it back from OneLake to verify:
# Read the CSV back from OneLake and display it
display(spark.read.csv(oneLake, header=True))
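On the original authentication question: if credential passthrough is not an option for your cluster, a commonly used alternative is service-principal (OAuth) authentication via the Hadoop ABFS driver. This is a minimal sketch, not the answer above's method, and it assumes you have an Azure AD app registration with access to the Fabric workspace; tenant_id, client_id, client_secret, and the secret scope/key names are all placeholders:

# Sketch: configure ABFS OAuth for the OneLake endpoint using a service principal.
# tenant_id, client_id, and the secret scope/key below are placeholders.
tenant_id = "<tenant-id>"
client_id = "<client-id>"
client_secret = dbutils.secrets.get(scope="<scope>", key="<key>")  # stored in a Databricks secret scope

host = "onelake.dfs.fabric.microsoft.com"
spark.conf.set(f"fs.azure.account.auth.type.{host}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{host}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{host}", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{host}", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{host}",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# After this, abfss:// reads and writes against OneLake authenticate as the service principal.

Note that the service principal typically also needs to be added to the Fabric workspace (for example with the Contributor role), and service principal access must be allowed in your Fabric tenant settings, for the write to be authorized.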