I'm trying to run a notebook on Analytics for Apache Spark running on Bluemix, but I hit the following error:
Exception: ("You must build Spark with Hive. Export 'SPARK_HIVE=true' and
run build/sbt assembly", Py4JJavaError(u'An error occurred while calling
None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o38))
The error is intermittent; it doesn't happen every time. The line of code in question is:
df = sqlContext.read.format('jdbc').options(
url=url,
driver='com.ibm.db2.jcc.DB2Driver',
dbtable='SAMPLE.ASSETDATA'
).load()
There are a few similar questions on Stack Overflow, but they aren't asking about the Spark service on Bluemix.
Create a new SQLContext object before using sqlContext:
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
and then run the code again.
This error happens if you have multiple notebooks using the out-of-the-box sqlContext. The predefined sqlContext is a HiveContext, and only one process at a time can initialize its metastore, which is why concurrent notebooks fail intermittently.
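For example, here is a minimal sketch of the fix applied to the JDBC read from your question. The connection URL shown is a placeholder; substitute the host, database, and credentials from your own Bluemix service (sc is the SparkContext that the notebook predefines for you):

from pyspark.sql import SQLContext

# Build a fresh SQLContext rather than relying on the notebook's
# predefined sqlContext, which may be shared with other notebooks.
sqlContext = SQLContext(sc)

# Placeholder DB2 connection URL; fill in your own host and credentials.
url = 'jdbc:db2://<host>:50000/<database>:user=<user>;password=<password>;'

df = sqlContext.read.format('jdbc').options(
    url=url,
    driver='com.ibm.db2.jcc.DB2Driver',
    dbtable='SAMPLE.ASSETDATA'
).load()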