Tags: mongodb, apache-spark, azure-cosmosdb, azure-databricks

Databricks connection to Cosmos DB Mongo API


I am trying to connect to the Cosmos DB Mongo API from Databricks, and I get the following error:

java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Invalid JSON String: ''

Option 1:

data = (
    spark.read.format("com.microsoft.azure.cosmosdb.spark")
    .option("Endpoint", "https://cosmosdb-myendpoint.com:443/")
    .option("Masterkey", "primary key of the account")
    .option("Database", "sample")
    .option("Collection", "sample1")
    .load()
)

Option 2:

cosmosConfig = {
  "Endpoint" : "https://cosmosdb-myendpoint.com:443/",
  "Masterkey" : "primary key of the account",
  "Database" : "sample",
  "Collection" : "sample1"
}

cosmosdbConnection = spark.read.format("com.microsoft.azure.cosmosdb.spark").options(**cosmosConfig).load()

Both of these options give the same "Invalid JSON String" error. I have already installed the connector library on the cluster.


Solution

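  • Yes, I had installed the SQL API connector (the com.microsoft.azure.cosmosdb.spark format used in the question), and that was the problem. I got it working with the MongoDB Spark connector, which is available through Maven. Maven coordinates: org.mongodb.spark:mongo-spark-connector_2.11:2.3.1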
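
For anyone landing here, this is a minimal sketch of what the read might look like with the MongoDB Spark connector in place of the SQL API connector. It assumes the 2.x connector from the coordinates above is installed on the cluster, and that you paste in the MongoDB connection string shown for your Cosmos DB account in the Azure portal. The helper names mongo_read_options and read_collection are made up for illustration and are not part of any library.

```python
def mongo_read_options(connection_string, database, collection):
    """Build reader options for the MongoDB Spark connector (2.x option names).

    connection_string is assumed to be the MongoDB connection string copied
    from the Cosmos DB account in the Azure portal; it is passed through as-is.
    """
    return {
        "uri": connection_string,
        "database": database,
        "collection": collection,
    }


def read_collection(spark, connection_string, database, collection):
    """Load a collection into a DataFrame through the MongoDB Spark connector.

    Note the format string: it points at the MongoDB connector, not at
    com.microsoft.azure.cosmosdb.spark (the SQL API connector).
    """
    options = mongo_read_options(connection_string, database, collection)
    return (
        spark.read.format("com.mongodb.spark.sql.DefaultSource")
        .options(**options)
        .load()
    )


# Hypothetical usage in a Databricks notebook, where `spark` already exists:
# df = read_collection(spark, "<connection string from the portal>", "sample", "sample1")
# df.printSchema()
```

The options are built in a separate function only so they can be checked without a cluster; passing the same three options directly to spark.read should behave the same way.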