I installed the following library at the cluster level: com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:3.7.0
I supplied the connection settings from the Cosmos DB Table API account:
cosmosConfig = {
    "Endpoint": "https://cosmos-account-name.table.cosmos.azure.com:443/",
    "Masterkey": "PrimaryKey",
    "Database": "TablesDB",
    "Collection": "Deals_Metadata"
}
Then I tried to read the collection using the Spark API:
cosmosdbConnection = spark.read.format("com.microsoft.azure.cosmosdb.spark").options(**cosmosConfig).load()
When I execute this, it throws the following error:
java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
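I reproduced this in my environment and got the same error.
A NoSuchMethodError on scala.Predef$.refArrayOps usually indicates a Scala binary version mismatch: azure-cosmosdb-spark_2.4.0_2.11 is built for Spark 2.4 and Scala 2.11, so it is likely to fail on a cluster that runs a different Scala version (for example, a Spark 3 runtime on Scala 2.12).
To resolve it, check that the Spark 3 connector (com.azure.cosmos.spark) JAR matching your cluster's Spark and Scala versions is installed, and then use the following code.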
Endpoint = "https://xxxx.documents.azure.com:443/"
MasterKey = "cosmos_db_key"
DatabaseName = "<dbname>"
ContainerName = "container"
spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", Endpoint)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", MasterKey)
spark.sql("CREATE DATABASE IF NOT EXISTS cosmosCatalog.{};".format(DatabaseName))
spark.sql("CREATE TABLE IF NOT EXISTS cosmosCatalog.{}.{} using cosmos.oltp TBLPROPERTIES(partitionKeyPath = '/id', manualThroughput = '1100')".format(DatabaseName, ContainerName))
Read the data into a Spark DataFrame:
Cfg1 = {
    "spark.cosmos.accountEndpoint": Endpoint,
    "spark.cosmos.accountKey": MasterKey,
    "spark.cosmos.database": DatabaseName,
    "spark.cosmos.container": ContainerName,
    "spark.cosmos.read.inferSchema.enabled": "false"
}
df = spark.read.format("cosmos.oltp").options(**Cfg1).load()
print(df.count())
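If the error persists, it may help to confirm whether the installed library's Scala suffix matches the cluster. The sketch below is only an illustration: the helper names scala_suffix and is_compatible are hypothetical, and it assumes the Scala binary version is the last "_x.y" or "_x-y" segment of the Maven artifact name, which holds for the coordinate in the question.

```python
import re


def scala_suffix(coordinate):
    """Return the Scala binary version encoded at the end of the artifact
    name in a Maven coordinate (group:artifact:version), or None."""
    parts = coordinate.split(":")
    if len(parts) < 2:
        return None
    # Accept both "_2.11" and "_2-12" style suffixes at the end of the artifact.
    match = re.search(r"_(\d+)[.-](\d+)$", parts[1])
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


def is_compatible(coordinate, cluster_scala_version):
    """True when the artifact's Scala suffix equals the cluster's
    major.minor Scala version (e.g. "2.12" for "2.12.15")."""
    suffix = scala_suffix(coordinate)
    if suffix is None:
        return False
    cluster_binary = ".".join(cluster_scala_version.split(".")[:2])
    return suffix == cluster_binary


# The library from the question is built for Scala 2.11.
old_library = "com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:3.7.0"
print(scala_suffix(old_library))

# Assumption: on a live cluster the Scala version can be read through py4j, e.g.
#   spark.sparkContext._jvm.scala.util.Properties.versionNumberString()
# Here a sample value is hard-coded so the snippet runs without Spark.
print(is_compatible(old_library, "2.12.15"))
```

If is_compatible returns False for the installed library, replacing it with a build that targets the cluster's Scala version is the likely fix.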
Reference:
Manage data with Azure Cosmos DB Spark 3 OLTP Connector for SQL API | Microsoft