I'm trying to use Spark locally on my machine, and I was able to reproduce the tutorial at:
However, when I try to use Hive I get the following error:
Error in value[3L] : Spark SQL is not built with Hive support
The code:
## Set Environment variables
Sys.setenv(SPARK_HOME = 'F:/Spark_build')
# Set the library Path
.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R','lib'),.libPaths()))
# load SparkR
library(SparkR)
sc <- sparkR.init()
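# The next call is the one that fails with "Spark SQL is not built with Hive support"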
sqlContext <- sparkRHive.init(sc)
sparkR.stop()
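(For reference, the end goal is the standard SparkR HiveContext workflow. A minimal sketch along the lines of the SparkR 1.x examples, where the table name is only illustrative:)
# Sketch: with a Hive-enabled build, a HiveContext can run HiveQL statements
hiveContext <- sparkRHive.init(sc)
sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
results <- sql(hiveContext, "SELECT key, value FROM src")
head(results)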
At first I suspected the problem was the pre-built version of Spark, so I tried building my own with Maven, which took almost an hour:
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests clean package
However, the error persists.
If you just followed the tutorial's instructions, you simply do not have Hive installed (try running hive from the command line). I have found this to be a common point of confusion for Spark beginners: "pre-built for Hadoop" does not mean that Spark needs Hadoop, let alone that it includes Hadoop (it does not), and the same holds for Hive.
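One quick way to check from SparkR itself whether a given build has Hive support is to wrap the HiveContext initialization in tryCatch. A minimal sketch, assuming the same SparkR 1.x API used in the question; sparkRHive.init is the call that fails when Hive support is missing:
library(SparkR)

sc <- sparkR.init(master = "local[*]")

# sparkRHive.init() raises the "not built with Hive support" error when
# Spark was compiled without -Phive, so catching it gives a yes/no check.
hasHive <- tryCatch({
  hiveContext <- sparkRHive.init(sc)
  TRUE
}, error = function(e) {
  message("No Hive support in this build: ", conditionMessage(e))
  FALSE
})

print(hasHive)
sparkR.stop()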