sql-server · apache-spark · azure-databricks

How to execute SET IDENTITY_INSERT <table> ON on a SQL Server table from Spark/Databricks?


I have been able to read from and write to a SQL Server table from Databricks using the JDBC driver. However, this time I need to execute a command on SQL Server before I write: SET IDENTITY_INSERT <sqlserver_table_name> ON. How can I do this from Databricks? Any help/pointers are appreciated. Thanks.


Solution

  • You can't do this with the JDBC Spark Connector (or the SQL Server Spark Connector), but it's trivial when using JDBC directly in Scala or Java. When using JDBC directly you have explicit control of the session, and you can issue multiple batches in the same session, or multiple statements in the same batch. For example:

    %scala
    
    import java.util.Properties
    import java.sql.DriverManager
    
    val jdbcUsername =  dbutils.secrets.get(scope = "kv", key = "sqluser")
    val jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
    val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    
    // Create the JDBC URL without passing in the user and password parameters.
    val jdbcUrl = s"jdbc:sqlserver://xxxxxx.database.windows.net:1433; . . ."
    
    val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    val stmt = connection.createStatement()
    val sql = "SET IDENTITY_INSERT <sqlserver_table_name> ON"

    stmt.execute(sql)

    // run additional batches here with IDENTITY_INSERT ON

    stmt.close()
    connection.close()
    
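    Because IDENTITY_INSERT is scoped to the session (and only one table per session can have it ON), the INSERT has to run on the same connection, and it is simplest to put everything in one batch. A minimal sketch, assuming a hypothetical table dbo.Customers with an IDENTITY column Id:

    ```scala
    import java.sql.DriverManager

    // jdbcUrl, jdbcUsername, jdbcPassword as defined above.
    val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    val stmt = connection.createStatement()

    // One batch: toggle IDENTITY_INSERT on, insert, toggle it back off.
    // An explicit column list is required while IDENTITY_INSERT is ON.
    val batch = """
      SET IDENTITY_INSERT dbo.Customers ON;
      INSERT INTO dbo.Customers (Id, Name) VALUES (42, 'Contoso');
      SET IDENTITY_INSERT dbo.Customers OFF;
    """
    stmt.execute(batch)

    stmt.close()
    connection.close()
    ```

    Turning IDENTITY_INSERT back OFF in the same batch avoids leaving the session in a state that blocks other identity inserts.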

    And you can always use the Spark Connector to load a staging table, then use JDBC to run a stored procedure or ad-hoc SQL batch to load the staging data into the target table.
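    The staging-table pattern above can be sketched as follows. The DataFrame, staging table, and stored procedure names (stagingDf, dbo.Customers_Staging, dbo.usp_LoadCustomers) are hypothetical:

    ```scala
    import java.sql.DriverManager

    // 1) Bulk-load the staging table with the Spark JDBC writer.
    stagingDf.write
      .format("jdbc")
      .option("url", jdbcUrl)
      .option("dbtable", "dbo.Customers_Staging")  // hypothetical staging table
      .option("user", jdbcUsername)
      .option("password", jdbcPassword)
      .mode("overwrite")
      .save()

    // 2) On a plain JDBC connection (where IDENTITY_INSERT can be toggled
    //    in-session), run a stored procedure or ad-hoc batch that moves the
    //    staging rows into the target table.
    val conn = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    val stmt = conn.createStatement()
    stmt.execute("EXEC dbo.usp_LoadCustomers")  // hypothetical procedure; could
                                                // also be an INSERT ... SELECT batch
    stmt.close()
    conn.close()
    ```

    This keeps the fast parallel write path of the Spark connector for the bulk of the data, while the identity-sensitive step runs in a single controlled JDBC session.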