apache-spark, thrift, apache-spark-sql, apache-spark-1.4

In Apache Spark SQL, how to close the metastore connection from a HiveContext


My project has unit tests for different HiveContext configurations (sometimes they are in one file, as they are grouped by feature).

After upgrading to Spark 1.4 I encounter a lot of 'java.sql.SQLException: Another instance of Derby may have already booted the database' problems, because a patch made those contexts unable to share the same metastore. Since it's not clean to reset the state of a singleton for every test, my only option boils down to "recycling" each context by terminating the previous Derby metastore connection. Is there a way to do this?


Solution

  • Well, in Scala I just used FunSuite for unit tests together with the BeforeAndAfterAll trait. Then you can just initialize your SparkContext in beforeAll, spawn your HiveContext from it, and finish it like this:

      override def afterAll(): Unit = {
        if (sparkContext != null) {
          sparkContext.stop()
        }
      }
    

    From what I've noticed, stopping the SparkContext also closes the HiveContext attached to it.
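
    For completeness, here is a minimal sketch of that pattern (the suite name, test body, and local master setting are hypothetical; it assumes ScalaTest's FunSuite and Spark 1.4's HiveContext):

      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.sql.hive.HiveContext
      import org.scalatest.{BeforeAndAfterAll, FunSuite}

      class HiveContextSuite extends FunSuite with BeforeAndAfterAll {

        @transient private var sparkContext: SparkContext = _
        @transient private var hiveContext: HiveContext = _

        override def beforeAll(): Unit = {
          super.beforeAll()
          // A local master keeps the test self-contained.
          val conf = new SparkConf().setMaster("local[2]").setAppName("HiveContextSuite")
          sparkContext = new SparkContext(conf)
          hiveContext = new HiveContext(sparkContext)
        }

        override def afterAll(): Unit = {
          // Stopping the SparkContext tears down the attached HiveContext and
          // releases the Derby metastore lock, so the next suite can boot its
          // own metastore without the "Another instance of Derby" error.
          if (sparkContext != null) {
            sparkContext.stop()
          }
          super.afterAll()
        }

        test("HiveContext can run a trivial query") {
          assert(hiveContext.sql("SELECT 1").collect().head.getInt(0) === 1)
        }
      }

    Because afterAll runs once per suite, grouping the tests that share a HiveContext configuration into one suite keeps the metastore setup and teardown to a single cycle per group.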