In Bluemix Spark I want to use HiveContext:
from pyspark.sql import HiveContext

HqlContext = HiveContext(sc)
# some code
df = HqlContext.read.parquet("swift://notebook.spark/file.parquet")
I get the following error:
Py4JJavaError: An error occurred while calling o45.parquet. : java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
The HiveContext is not included by default in the Bluemix Spark offering.
To include it in your notebook, you should be able to use %AddJar to load it from a publicly accessible server, e.g.:
%AddJar http://my.server.com/jars/spark-hive_2.10-1.5.2.jar
You can also point it directly at Maven Central:
%AddJar http://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.10/1.5.2/spark-hive_2.10-1.5.2.jar
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
This enabled the HiveContext in my notebook.
Note that the latest versions available on Maven probably won't line up with the version of Spark currently running on Bluemix, so first check the Spark version with:
sc.version
Then match the version of this JAR to that version of Spark.
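To avoid assembling the Maven Central URL by hand, here is a minimal sketch of a helper that builds the spark-hive JAR URL for a given Spark version. The function name `spark_hive_jar_url` is my own invention, not part of any Spark or Bluemix API; it simply follows the standard Maven repository layout shown in the link above, assuming the Scala 2.10 builds used in the example.

```python
# Hypothetical helper (not a Spark/Bluemix API): build the Maven Central URL
# for the spark-hive JAR matching a given Spark version.
def spark_hive_jar_url(spark_version, scala_version="2.10"):
    artifact = "spark-hive_%s" % scala_version
    # Standard Maven layout: groupId path / artifactId / version / artifactId-version.jar
    return ("http://repo1.maven.org/maven2/org/apache/spark/"
            "%s/%s/%s-%s.jar" % (artifact, spark_version, artifact, spark_version))

print(spark_hive_jar_url("1.5.2"))
# -> http://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.10/1.5.2/spark-hive_2.10-1.5.2.jar
```

The printed URL is what you would then pass to %AddJar in the notebook.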