azure-databricks azure-cosmosdb-tables

Unable to connect to Cosmos DB Table API from Databricks; it throws an error


I installed the library at the cluster level: com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:3.7.0

I supplied the connection settings from the Cosmos DB Table API account:

    cosmosConfig = {
        "Endpoint": "https://cosmos-account-name.table.cosmos.azure.com:443/",
        "Masterkey": "PrimaryKey",
        "Database": "TablesDB",
        "Collection": "Deals_Metadata"
    }

Then I tried to read the table using the Spark API:

    cosmosdbConnection = spark.read.format("com.microsoft.azure.cosmosdb.spark").options(**cosmosConfig).load()

When I execute this, it throws the error below:

    java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;


Solution

  • I tried to reproduce this in my environment and got the same error.
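
    A NoSuchMethodError on scala.Predef$.refArrayOps typically indicates a Scala binary mismatch: the library in the question carries a _2.11 suffix, so it was built for Scala 2.11, while Spark 3 clusters run Scala 2.12. As a minimal sketch, the suffix of a Maven coordinate can be compared with the cluster's Scala version before installing it. The helper names and the cluster version string "2.12.15" are assumptions for illustration, not part of any connector API.

```python
import re


def scala_suffix(maven_coordinate):
    """Return the Scala binary version encoded in the artifact name, or None."""
    parts = maven_coordinate.split(":")
    if len(parts) < 2:
        raise ValueError("expected group:artifact[:version]")
    # Artifact names conventionally end in _2.11 or _2.12 (some use _2-12).
    match = re.search(r"_(\d+)[.-](\d+)$", parts[1])
    return f"{match.group(1)}.{match.group(2)}" if match else None


def matches_cluster(maven_coordinate, cluster_scala_version):
    """True when the artifact's Scala suffix equals the cluster's major.minor."""
    suffix = scala_suffix(maven_coordinate)
    if suffix is None:
        return True  # no suffix found, so there is nothing to compare
    cluster_binary = ".".join(cluster_scala_version.split(".")[:2])
    return suffix == cluster_binary


# Coordinate from the question; "2.12.15" is an assumed cluster Scala version.
old_library = "com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:3.7.0"
print(scala_suffix(old_library))
print(matches_cluster(old_library, "2.12.15"))
```

    The Scala version of a given cluster is listed in the Databricks runtime release notes; if the check returns False, the library was built for a different Scala version than the cluster runs.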

    To resolve this error, check that the com.azure.cosmos.spark connector JAR is installed correctly on the cluster, then use the code below.

    # Connection details (placeholders; replace with your own account values)
    Endpoint = "https://xxxx.documents.azure.com:443/"
    MasterKey = "cosmos_db_key"
    DatabaseName = "<dbname>"
    ContainerName = "container"
    
    # Register the Cosmos DB catalog so databases and containers can be managed through Spark SQL
    spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
    spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", Endpoint)
    spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", MasterKey)
    
    # Create the database and the container if they do not already exist
    spark.sql("CREATE DATABASE IF NOT EXISTS cosmosCatalog.{};".format(DatabaseName))
    spark.sql("CREATE TABLE IF NOT EXISTS cosmosCatalog.{}.{} using cosmos.oltp TBLPROPERTIES(partitionKeyPath = '/id', manualThroughput = '1100')".format(DatabaseName, ContainerName))
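
    The two statements above are assembled with plain string formatting. As a hedged sketch, they can be built by small helper functions so the SQL can be inspected (and the names backtick-quoted) before being passed to spark.sql. The function names, the database name "SampleDB", and the quoting policy are assumptions for illustration only.

```python
def quote_identifier(name):
    """Backtick-quote a database or container name for Spark SQL."""
    if not name or "`" in name:
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"`{name}`"


def create_database_sql(catalog, database):
    """Build the CREATE DATABASE statement for the given catalog."""
    return f"CREATE DATABASE IF NOT EXISTS {catalog}.{quote_identifier(database)}"


def create_table_sql(catalog, database, container,
                     partition_key_path="/id", manual_throughput=1100):
    """Build the CREATE TABLE statement with the same properties as above."""
    return (
        f"CREATE TABLE IF NOT EXISTS {catalog}."
        f"{quote_identifier(database)}.{quote_identifier(container)} "
        "USING cosmos.oltp "
        f"TBLPROPERTIES(partitionKeyPath = '{partition_key_path}', "
        f"manualThroughput = '{int(manual_throughput)}')"
    )


# "SampleDB" is a hypothetical database name used only for this example.
print(create_database_sql("cosmosCatalog", "SampleDB"))
print(create_table_sql("cosmosCatalog", "SampleDB", "container"))
# In a notebook, each string would then be passed to spark.sql(...).
```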
    

    Read the data into a Spark DataFrame:

    Cfg1 = {
      "spark.cosmos.accountEndpoint": Endpoint,
      "spark.cosmos.accountKey": MasterKey,
      "spark.cosmos.database": DatabaseName,
      "spark.cosmos.container": ContainerName,
      "spark.cosmos.read.inferSchema.enabled" : "false"
    }
    
    df = spark.read.format("cosmos.oltp").options(**Cfg1).load()
    print(df.count())
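
    The read configuration above disables schema inference; as far as the connector documentation describes it, reading without inference and without a user-supplied schema returns the raw JSON rather than typed columns. A minimal sketch of a helper that builds the same options and makes that switch explicit follows. The function name and the infer_schema parameter are hypothetical; the option keys are the ones used in the answer.

```python
def cosmos_read_config(endpoint, key, database, container, infer_schema=False):
    """Build the option dict for spark.read.format("cosmos.oltp")."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
        # Spark options are passed as strings, so the flag is rendered as text.
        "spark.cosmos.read.inferSchema.enabled": "true" if infer_schema else "false",
    }


# Placeholder values, matching the style of the answer's example.
cfg = cosmos_read_config(
    "https://xxxx.documents.azure.com:443/",
    "cosmos_db_key",
    "<dbname>",
    "container",
    infer_schema=True,
)
print(cfg["spark.cosmos.read.inferSchema.enabled"])
# In a notebook: df = spark.read.format("cosmos.oltp").options(**cfg).load()
```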
    

    Reference:

    Manage data with Azure Cosmos DB Spark 3 OLTP Connector for SQL API | Microsoft