The following Scala code works fine, and the test runs:
import org.scalatest._
import com.holdenkarau.spark.testing._

class DummyTest extends FunSuite with SharedSparkContext {
  test("shared context only works inside test functions.") {
    val myRDD = sc.parallelize(List(1, 2, 3, 4))
  }
}
However, the following Scala code results in a java.lang.NullPointerException on the sc.parallelize line:
import org.scalatest._
import com.holdenkarau.spark.testing._

class DummyTest extends FunSuite with SharedSparkContext {
  val myRDD = sc.parallelize(List(1, 2, 3, 4))

  test("shared context only works inside test functions.") {
    assert(true)
  }
}
What causes the NullPointerException when the SparkContext is used outside of the test function?
The SparkContext is declared within SharedSparkContext, but it is not initialized as part of that trait's construction. Rather, it is initialized in the trait's beforeAll() method, which the test framework calls only after the suite has been fully instantiated. Source is here: https://github.com/holdenk/spark-testing-base/blob/master/src/main/pre-2.0/scala/com/holdenkarau/spark/testing/SharedSparkContext.scala. If you use sc while your class is being initialized, beforeAll() has not yet been called, so sc is still null.
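If you do want the RDD available as a suite-level field, one workaround (a sketch on my part, not something from the trait's documentation) is to assign it inside an overridden beforeAll(), after calling super.beforeAll() so that SharedSparkContext has already created sc:

import org.scalatest._
import com.holdenkarau.spark.testing._
import org.apache.spark.rdd.RDD

class DummyTest extends FunSuite with SharedSparkContext {
  // Suite-level fixture, populated only after the trait has created sc.
  var myRDD: RDD[Int] = _

  override def beforeAll(): Unit = {
    super.beforeAll()                        // SharedSparkContext initializes sc here
    myRDD = sc.parallelize(List(1, 2, 3, 4)) // safe: sc is no longer null
  }

  test("shared context only works inside test functions.") {
    assert(myRDD.count() == 4)
  }
}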
So to summarize, the order is:

1. The test framework creates an instance of your suite class.
2. The class body runs as part of construction, so val myRDD = sc.parallelize(...) is evaluated while sc is still null.
3. beforeAll() is called, which creates the SparkContext and assigns it to sc.
4. The test functions run.

So you can use sc in step 4 but not in step 2.
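Another option, assuming you just want the field without managing beforeAll() yourself (again a sketch, not part of the original answer), is to make the field a lazy val so the parallelize call is deferred from step 2 to step 4:

import org.scalatest._
import com.holdenkarau.spark.testing._
import org.apache.spark.rdd.RDD

class DummyTest extends FunSuite with SharedSparkContext {
  // Not evaluated during construction (step 2); the parallelize call only
  // runs the first time myRDD is touched inside a test (step 4).
  lazy val myRDD: RDD[Int] = sc.parallelize(List(1, 2, 3, 4))

  test("shared context only works inside test functions.") {
    assert(myRDD.collect().toList == List(1, 2, 3, 4))
  }
}

A def would work just as well as a lazy val if you don't mind rebuilding the RDD on each access.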