Search code examples
pysparkapache-spark-sqldatabricksspark-streaming

spark.read.format("jdbc") works in Databricks Runtime 10.4 but not in Runtime 12.2 with spark.read.format("sqlserver")


This is how I have successfully been reading data from an SQL server in Databricks Runtime 10.4, using the legacy JDBC driver:

remote_table = (
  spark.read.format("jdbc")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option("url", "jdbc:sqlserver://mydomainname.com;database=Runtime")
    .option("dbtable", "Runtime.dbo.History")
    .option("user", "myusername")
    .option("password", "mypassword")
    .load()
)

I want to use the "newer" approach with Runtime 12.2:

remote_table = (spark.read
  .format("sqlserver")
  .option("host", "mydomainname.com")
  .option("user", "myusername")
  .option("password", "mypassword")
  .option("database", "Runtime")
  .option("dbtable", "Runtime.dbo.History")
  .load()
)

But an error is returned:

com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target". ClientConnectionId:56f3b77d-a90a-4c89-98be-88924f364897

And yes, I am remembering to switch the cluster to one with Runtime 12.2.


Solution

  • "sqlserver" connector is just a wrapper over JDBC and you would encounter same issue on runtime 12.x if you attempted legacy JDBC connection.

    While Databricks runtime 10.x used SQL Server JDBC driver version 9.x, there was a breaking change in version 10.2 (which is used in 12.x runtime) that enabled TLS encryption by default and forced certificate validation. For this similar issue answered here. Extra info in SQL Server docs section Understanding encryption support.

    In short, add following options:

    .option("encrypt", "false")
    

    or

    .option("encrypt", "true")
    .option("trustServerCertificate", "true")