apache-spark, gradle, gradlew

WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor)


I have two integration tests for my DataFrame transformation code (using https://github.com/holdenk/spark-testing-base), and they both run fine when run individually in IntelliJ.
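
For context, each suite looks roughly like this (a minimal sketch mixing spark-testing-base's DataFrameSuiteBase into a WordSpec; the class name, data, and assertion are placeholders, not my real transformation tests):

    import com.holdenkarau.spark.testing.DataFrameSuiteBase
    import org.scalatest.WordSpec

    class TransformSpec extends WordSpec with DataFrameSuiteBase {
      "the transformation" should {
        "preserve an already-clean DataFrame" in {
          // sqlContext comes from DataFrameSuiteBase; copy it to a val so
          // the implicits import has a stable identifier
          val sqlCtx = sqlContext
          import sqlCtx.implicits._

          val input = Seq(("a", 1), ("b", 2)).toDF("key", "value")
          assertDataFrameEquals(input, input) // placeholder assertion
        }
      }
    }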

However, when I run my Gradle build, I see the following messages for the first test:

17/04/06 11:29:02 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor).  This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:

And:

17/04/06 11:29:05 ERROR SparkContext: Error initializing SparkContext.
akka.actor.InvalidActorNameException: actor name [ExecutorEndpoint] is not unique!

And:

java.lang.NullPointerException
at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:152)

The second test runs partway and then aborts with the following message (this code runs fine on the actual cluster, BTW):

org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.NullPointerException
org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)

Here's a pastebin of the full build output: https://pastebin.com/drG20kcB

How do I run all of my Spark integration tests together?

Thanks!

PS: In case it's relevant, I'm using the Gradle wrapper (./gradlew clean build).


Solution

  • I needed this:

    test {
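      // run at most one test worker JVM at a time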
      maxParallelForks = 1
    }
    

    However, if there is a way to turn off parallel execution for only a specific subset of tests in Gradle, I would much prefer that solution; one possible approach is sketched below.

    I'm using ScalaTest with WordSpec BTW.
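
    For the subset-only route, one option is to split the Spark suites into their own Test task and serialize just that task. This is only a sketch, not tested against this build: the package/path names are hypothetical, and the classes-dir properties differ across Gradle versions:

    task integrationTest(type: Test) {
      // wire the task to the existing test source set (older Gradle
      // versions use testClassesDir / sourceSets.test.output.classesDir)
      testClassesDirs = sourceSets.test.output.classesDirs
      classpath = sourceSets.test.runtimeClasspath

      include '**/integration/**'   // only the Spark integration suites
      maxParallelForks = 1          // at most one of these JVMs at a time
    }

    test {
      exclude '**/integration/**'   // unit tests keep the default settings
    }

    check.dependsOn integrationTest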