I am using the Databricks VSCode extension for development in an IDE. The basic functionality all works well: I connected to an Azure Databricks workspace with Unity Catalog enabled, selected an active cluster (DBR 13.2), and configured the sync destination. I am able to execute code. Now I want to use Databricks Connect "V2" to run my code locally.
I have the following code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
However, when I run this, I get the following error:
RuntimeError: Only remote Spark sessions using Databricks Connect are supported. Could not find connection parameters to start a Spark remote session.
Am I missing something? I authenticated once with the Azure CLI and once with a PAT. I also tried both DBR 13.2 and 13.3, but every combination failed.
Thanks!
OK, that issue was fixed in extension version 1.1.1, which exports the SPARK_REMOTE environment variable that spark = SparkSession.builder.getOrCreate() needs in order to work.
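For illustration, here is a rough sketch of what such a SPARK_REMOTE value looks like. It assumes the connection-string layout described in the Databricks Connect documentation (sc://<workspace-host>:443/;token=...;x-databricks-cluster-id=...). The helper name build_spark_remote and all values in the usage part are hypothetical placeholders, not something the extension itself exposes.

```python
def build_spark_remote(host: str, token: str, cluster_id: str) -> str:
    """Build a Spark Connect connection string for a Databricks cluster.

    Hypothetical helper: the layout is assumed from the Databricks Connect
    docs and is not taken from the extension's source code.
    """
    # Drop an optional "https://" scheme and any trailing slash from the host.
    hostname = host.split("://", 1)[-1].rstrip("/")
    return f"sc://{hostname}:443/;token={token};x-databricks-cluster-id={cluster_id}"


if __name__ == "__main__":
    # Placeholder values; substitute your own workspace host, PAT and cluster ID.
    print(build_spark_remote(
        "https://adb-1234567890123456.7.azuredatabricks.net",
        "<personal-access-token>",
        "<cluster-id>",
    ))
```

If a variable of this shape is set in the environment before the script starts, SparkSession.builder.getOrCreate() should be able to pick it up without any further configuration.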
But please note that this works only if you configure profile-based authentication, not with azure-cli or OAuth authentication. For those to work, you need to create a DatabricksSession instead, which you can import with from databricks.connect import DatabricksSession
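A minimal sketch of that approach, assuming the databricks-connect package is installed and the connection details can be resolved from your environment or configuration. The wrapper function get_spark is a hypothetical name used only for illustration:

```python
def get_spark():
    """Return a remote Spark session created through Databricks Connect.

    Hypothetical wrapper: the import is kept inside the function so the
    module can be loaded even where databricks-connect is not installed.
    """
    from databricks.connect import DatabricksSession

    # Drop-in replacement for SparkSession.builder.getOrCreate().
    return DatabricksSession.builder.getOrCreate()


if __name__ == "__main__":
    spark = get_spark()
    # Small smoke test that runs on the remote cluster.
    spark.range(5).show()
```

The rest of the code can keep using the returned spark object exactly as it would a regular SparkSession.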