I'm following the linked guide to load data into an Azure SQL DB from Databricks, but I'm getting the following error:
command-3227900948916301:23: error: value bulkCopyToSqlDB is not a member of org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row]
df.write.mode(SaveMode.Overwrite).bulkCopyToSqlDB(bulkCopyConfig)
My code is as follows:
val bulkCopyConfig = Config(Map(
"url" -> url,
"databaseName" -> databaseName,
"dbTable" -> "dbo.xxxx",
"user" -> user,
"password" -> password,
"connectTimeout" -> "120",
"bulkCopyBatchSize" -> "100000",
"bulkCopyTableLock" -> "true",
"bulkCopyTimeout" -> "0",
"truncate" -> "true"
// "queryTimeout" -> "5"
))
df.write.mode(SaveMode.Overwrite).bulkCopyToSqlDB(bulkCopyConfig)
Any thoughts on why I'm getting the error?
You need the correct imports in scope to extend the DataFrame with the additional functions:
import com.microsoft.azure.sqldb.spark.connect._
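With both imports in place, your snippet would look roughly like this (a minimal sketch, assuming the azure-sqldb-spark library is attached to the cluster and that url, databaseName, user, password, and df are defined as in your code; I've trimmed the config to the keys used for bulk copy):

import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._

val bulkCopyConfig = Config(Map(
  "url" -> url,
  "databaseName" -> databaseName,
  "dbTable" -> "dbo.xxxx",
  "user" -> user,
  "password" -> password,
  "connectTimeout" -> "120",
  "bulkCopyBatchSize" -> "100000",
  "bulkCopyTableLock" -> "true",
  "bulkCopyTimeout" -> "0"
))

// with connect._ in scope, bulkCopyToSqlDB is available on the DataFrame itself
df.bulkCopyToSqlDB(bulkCopyConfig)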
Here is what I get (it fails because I don't have an active SQL DB, but the function was found):
scala> import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.config.Config
scala> import com.microsoft.azure.sqldb.spark.connect._
import com.microsoft.azure.sqldb.spark.connect._
scala> val bulkCopyConfig = Config(Map(
| "url" -> "mysqlserver.database.windows.net",
| "databaseName" -> "MyDatabase",
| "user" -> "username",
| "password" -> "*********",
| "databaseName" -> "MyDatabase",
| "dbTable" -> "dbo.Clients",
| "bulkCopyBatchSize" -> "2500",
| "bulkCopyTableLock" -> "true",
| "bulkCopyTimeout" -> "600"
| ))
bulkCopyConfig: com.microsoft.azure.sqldb.spark.config.Config = com.microsoft.azure.sqldb.spark.config.ConfigBuilder$$anon$1@754443e1
scala> val df = spark.range(1,10)
df: org.apache.spark.sql.Dataset[Long] = [id: bigint]
scala> df.bulkCopyToSqlDB(bulkCopyConfig)
20/10/06 16:24:38 ERROR DataFrameFunctions: Connection cannot be established to the database
com.microsoft.sqlserver.jdbc.SQLServerException: Login failed for user 'username'. ClientConnectionId:b35aa6bb-24ec-4727-98b6-074cae4e6a32
...
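For context: the com.microsoft.azure.sqldb.spark.connect package object defines an implicit class named DataFrameFunctions (the same name that appears in the error log above), and that implicit is what adds bulkCopyToSqlDB to a DataFrame. This also explains the exact wording of your compile error: the method extends DataFrame, not the DataFrameWriter returned by df.write. A compact sketch of the distinction (the config values here are placeholders):

import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._
import org.apache.spark.sql.SaveMode

val df = spark.range(1, 10).toDF()
val cfg = Config(Map(
  "url" -> "myserver.database.windows.net",
  "databaseName" -> "MyDatabase",
  "user" -> "username",
  "password" -> "password",
  "dbTable" -> "dbo.MyTable"
))

// compiles: the implicit DataFrameFunctions applies to the DataFrame
df.bulkCopyToSqlDB(cfg)

// does not compile: DataFrameWriter has no such member -- this is the error from the question
// df.write.mode(SaveMode.Overwrite).bulkCopyToSqlDB(cfg)

In Databricks, also make sure the connector library is attached to the cluster; if I remember correctly, the Maven coordinate is com.microsoft.azure:azure-sqldb-spark:1.0.2.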