scala, apache-spark, spark-streaming, scalatest

Spark Unit Testing: How to initialize sc only once for all the Suites using FunSuite


I want to write Spark unit tests, and I am using FunSuite for them. I want my SparkContext to be initialized only once, shared by all the suites, and stopped after all of them have completed.

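import org.scalatest.{BeforeAndAfter, FunSuite}

abstract class baseClass extends FunSuite with BeforeAndAfter {
  // runs before every test in every suite that extends baseClass
  before {
    println("initialize spark context")
  }
  // runs after every test in every suite that extends baseClass
  after {
    println("kill spark context")
  }
}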



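import org.junit.runner.RunWith
// ScalaTest 3.0.x location; newer versions ship it as org.scalatestplus.junit.JUnitRunner
import org.scalatest.junit.JUnitRunner

@RunWith(classOf[JUnitRunner])
class A extends baseClass {
  test("for class A") {
    // assert
  }
}

@RunWith(classOf[JUnitRunner])
class B extends baseClass {
  test("for class B") {
    // assert
  }
}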

But when I run sbt test, I can see that the println statements in baseClass are called for both tests. Obviously, when objects are created for classes A and B, the abstract baseClass code is called. So how can I achieve my goal, i.e. have the Spark context initialized only once while all the test cases run?


Solution

  • If you really want to share the context between suites, you'll have to make it static, which in Scala means putting it in an object. You can then use a lazy val so that it starts on first use. As for shutting it down, you can leave that to the shutdown hook that Spark registers automatically whenever a context is created.

    It would look something like:

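    import org.apache.spark.SparkContext
    import org.scalatest.FunSuite

    abstract class SparkSuiteBase extends FunSuite {
        // every suite sees the same instance held by the companion object
        lazy val sparkContext: SparkContext = SparkSuiteBase.sparkContext
    }
    
    // putting the Spark Context inside an object allows reusing it between tests
    object SparkSuiteBase {
        // evaluated at most once, on first access from any suite
        private lazy val sparkContext: SparkContext = ??? // create the context here
    }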
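
To make the idea concrete, here is a minimal sketch of the same pattern with the context creation filled in. The names SharedSparkContext, SharedSparkSuite, WordLengthSuite and CountSuite, the local[2] master and the app name are all assumptions chosen for illustration; none of them come from the answer above. It also assumes ScalaTest 3.0.x, where the base class is org.scalatest.FunSuite.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.FunSuite // newer ScalaTest releases use org.scalatest.funsuite.AnyFunSuite

// Hypothetical JVM-wide holder: the lazy val is evaluated at most once, on first access.
object SharedSparkContext {
  // Assumed settings for local test runs; adjust master and app name as needed.
  lazy val sc: SparkContext = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("shared-test-context")
    new SparkContext(conf)
  }
}

// Suites extend this base class and reach the shared context through `sc`.
abstract class SharedSparkSuite extends FunSuite {
  def sc: SparkContext = SharedSparkContext.sc
}

class WordLengthSuite extends SharedSparkSuite {
  test("sums word lengths") {
    val total = sc.parallelize(Seq("a", "bb", "ccc")).map(_.length).reduce(_ + _)
    assert(total == 6)
  }
}

class CountSuite extends SharedSparkSuite {
  test("counts elements") {
    assert(sc.parallelize(1 to 100).count() == 100L)
  }
}
```

Neither suite stops the context, so it stays alive for whichever suite runs next. This only holds when all suites run in one JVM; if the build forks a separate JVM per suite or test group, each JVM would create its own context.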