I'm reading data in batch from a Cassandra database and in streaming from Azure Event Hubs, using the Spark Scala API.
session.read
.format("org.apache.spark.sql.cassandra")
.option("keyspace", keyspace)
.option("table", table)
.option("pushdown", pushdown)
.load()
and
session.readStream
.format("eventhubs")
.options(eventHubsConf.toMap)
.load()
Everything was running fine, but now I get this exception out of nowhere:
User class threw exception: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(Lscala/Function0;Lscala/Function0;Lorg/apache/spark/sql/catalyst/analysis/FunctionRegistry;Lorg/apache/spark/sql/internal/SQLConf;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/spark/sql/catalyst/parser/ParserInterface;Lorg/apache/spark/sql/catalyst/catalog/FunctionResourceLoader;)V
at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalog$lzycompute(BaseSessionStateBuilder.scala:132)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.catalog(BaseSessionStateBuilder.scala:131)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$1.<init>(BaseSessionStateBuilder.scala:157)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.analyzer(BaseSessionStateBuilder.scala:157)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:79)
at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:79)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:428)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:233)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
I don't know exactly what changed, but here are my dependencies:
ThisBuild / scalaVersion := "2.11.11"
val sparkVersion = "2.4.0"
libraryDependencies ++= Seq(
"org.apache.logging.log4j" % "log4j-core" % "2.11.1",
"org.apache.spark" %% "spark-core" % sparkVersion % "provided",
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
"org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
"org.apache.spark" %% "spark-catalyst" % sparkVersion % "provided",
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"com.microsoft.azure" % "azure-eventhubs-spark_2.11" % "2.3.10",
"com.microsoft.azure" % "azure-eventhubs" % "2.3.0",
"com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1",
"org.scala-lang.modules" %% "scala-java8-compat" % "0.9.0",
"com.twitter" % "jsr166e" % "1.1.0",
"com.holdenkarau" %% "spark-testing-base" % "2.4.0_0.12.0" % Test,
"MrPowers" % "spark-fast-tests" % "0.19.2-s_2.11" % Test
)
Does anyone have a clue?
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(
Lscala/Function0;Lscala/Function0;
Lorg/apache/spark/sql/catalyst/analysis/FunctionRegistry;
Lorg/apache/spark/sql/internal/SQLConf;
Lorg/apache/hadoop/conf/Configuration;
Lorg/apache/spark/sql/catalyst/parser/ParserInterface;
Lorg/apache/spark/sql/catalyst/catalog/FunctionResourceLoader;)V
This suggests to me that one of the libraries was compiled against a version of Spark different from the one on the runtime classpath. The method signature above matches the Spark 2.4.0 signature, but not the Spark 2.3.0 signature.
My guess is that there is a Spark 2.3.0 runtime somewhere. Perhaps you are running the application with spark-submit from a Spark 2.3.0 install?
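One way to test this guess is to ask the running JVM which jar a class was actually loaded from and which constructors it exposes. The following is a minimal sketch, not something from the original post: `ClasspathCheck`, `locationOf` and `constructorSignatures` are hypothetical names, and it uses only standard Java reflection, so it makes no assumption about the Spark version in use.

```scala
import scala.util.Try

// Hypothetical diagnostic helper (not part of Spark or any connector).
object ClasspathCheck {

  // Load the class without initializing it, using the context class loader,
  // which is normally the loader that sees the application and Spark jars.
  private def load(className: String): Option[Class[_]] =
    Try(Class.forName(className, false, Thread.currentThread().getContextClassLoader)).toOption

  // Jar or directory the class was loaded from. None if the class is missing
  // or the JVM does not report a code source for it.
  def locationOf(className: String): Option[String] =
    load(className)
      .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)

  // Parameter types of each public constructor present at runtime,
  // e.g. "(scala.Function0, scala.Function0, ...)".
  def constructorSignatures(className: String): Seq[String] =
    load(className).toSeq.flatMap { c =>
      c.getConstructors.toSeq.map(_.getParameterTypes.map(_.getName).mkString("(", ", ", ")"))
    }
}
```

Called from the driver with "org.apache.spark.sql.catalyst.catalog.SessionCatalog", `locationOf` should show which spark-catalyst jar is in use, and `constructorSignatures` can be compared against the parameter types listed in the NoSuchMethodError. Printing `org.apache.spark.SPARK_VERSION` next to these gives the version of the Spark core jar that is on the classpath.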