Being new to data warehousing, I have a new requirement to create an EXTERNAL TABLE in the DWH over data in Data Lake (Gen1/Gen2) from Databricks. I used the linked documentation to create the code below.
// Set up the Blob storage account access key in the notebook session conf.
spark.conf.set(
  "fs.azure.account.key.<your-storage-account-name>.blob.core.windows.net",
  "<your-storage-account-access-key>")

// Get some data from a SQL DW table.
val df: DataFrame = spark.read
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
  .option("tempDir", "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>")
  .option("forwardSparkAzureStorageCredentials", "true")
  .option("dbTable", "my_table_in_dw")
  .load()
Here is the code that I have written:
%scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.{DataFrame, SQLContext}

Class.forName("com.databricks.spark.sqldw.DefaultSource")

spark.conf.set("fs.azure.account.key.xxxxxxxxx.blob.core.windows.net", "xxxxxxxxxxxxxxx")

// Load data from a SQL DW query
val df: DataFrame = spark.read
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://xxxxxxxxxxx.database.windows.net:1433;database=xxxxxxxx")
  .option("tempDir", "wasbs://[email protected]")
  .option("forwardSparkAzureStorageCredentials", "true")
  .option("dbTable", "dbo.EXT_TEST")
  .load()
This is throwing an error: com.databricks.spark.sqldw.SqlDWConnectorException: Exception encountered in SQL DW connector code. Where am I going wrong? Any help would be appreciated.
Make sure to pass the "tempDir" in the format shown below:
tempDir = "wasbs://" + blobContainer + "@" + blobStorage + "/tempDirs"
Reference: Load data into Azure SQL Data Warehouse
You may also refer to the suggestions outlined in the GitHub issue, which addresses a similar problem.
Hope this helps.