Tags: apache-spark, spark-connect, apache-spark-connector

Spark connect client failing with java.lang.NoClassDefFoundError


Java: 1.8, sbt: 1.9, Scala: 2.12

I have a very simple repo with the following dependency in build.sbt:

libraryDependencies ++= Seq("org.apache.spark" %% "spark-connect-client-jvm" % "3.5.0")

And a simple application:

import org.apache.spark.sql.SparkSession

object Main extends App {
  val s = SparkSession.builder().remote("sc://localhost").getOrCreate()
  s.read.json("/tmp/input.json").repartition(10).show(false)
}

But when I run it, I get the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/sparkproject/connect/client/com/google/common/cache/CacheLoader
    at Main$.delayedEndpoint$Main$1(Main.scala:4)
    at Main$delayedInit$body.apply(Main.scala:3)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at Main$.main(Main.scala:3)
    at Main.main(Main.scala)
Caused by: java.lang.ClassNotFoundException: org.sparkproject.connect.client.com.google.common.cache.CacheLoader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 11 more

I know Spark Connect does a lot of shading during assembly, so the error could be related to that. The application is not started via spark-submit, and it does not run under a SPARK_HOME either (I guess that's the whole point of the Connect client).
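One way to test the shading hypothesis is to ask the class loader directly whether the relocated Guava class from the stack trace is visible. The following is a minimal sketch only: the object name ShadedClassCheck and the helper isOnClasspath are hypothetical, and the unshaded class name is included purely for comparison.

```scala
object ShadedClassCheck {
  // Hypothetical helper: true if the class loader can locate the class.
  // initialize = false avoids running static initializers during the probe.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    // Relocated name taken from the stack trace above.
    val shaded = "org.sparkproject.connect.client.com.google.common.cache.CacheLoader"
    // Original Guava name, probed only for comparison.
    val unshaded = "com.google.common.cache.CacheLoader"
    println(s"$shaded -> ${isOnClasspath(shaded)}")
    println(s"$unshaded -> ${isOnClasspath(unshaded)}")
  }
}
```

If the shaded name is reported as missing while the client jar is on the classpath, that would suggest the jar's bytecode refers to a relocated class that was not actually bundled into it.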

I followed the doc exactly as described. Can somebody help?


Solution

  • This is definitely an issue with shading; it was probably introduced in the recent dependency rework. My apologies for the poor experience. I have filed https://issues.apache.org/jira/browse/SPARK-45371 to track this on our end. I will keep you posted.
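Once a release containing the fix is available, the change on the application side would presumably be limited to the dependency version in build.sbt. The fragment below is a sketch under that assumption: the version 3.5.1 is a guess at the first fixed release, not something stated in this thread, so check the fix version listed on the JIRA ticket before relying on it.

```scala
// build.sbt
// ASSUMPTION: "3.5.1" stands in for whichever release ships the SPARK-45371 fix.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-connect-client-jvm" % "3.5.1"
)
```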