
sparklyr - error after installation


I'm very happy with the easy installation of sparklyr.

spark_install(version = "2.1.0", hadoop_version = "2.7")

Installation complete.

But after the installation, when I want to connect to Spark, I get the following error message. The folder C:/spark doesn't exist, because RStudio installed the Spark folder under my user directory.

 > sc <- spark_connect(master = "local")

Created default hadoop bin directory under: C:\spark\tmp\hadoop
Error in spark_version_from_home(spark_home, default = spark_version) :
  Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION. Try passing the spark version explicitly.
In addition: Warning messages:
1: In dir.create(hivePath, recursive = TRUE) :
  cannot create dir 'C:\spark', reason 'Permission denied'
2: In dir.create(hadoopBinPath, recursive = TRUE) :
  cannot create dir 'C:\spark', reason 'Permission denied'
3: In file.create(to[okay]) :
  cannot create file 'C:\spark\tmp\hadoop\bin\winutils.exe', reason 'No such file or directory'
4: running command '"C:\spark\tmp\hadoop\bin\winutils.exe" chmod 777 "C:\spark\tmp\hive"' had status 127
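The error suggests passing the Spark version explicitly. I assume it means something along these lines (untested sketch; spark_installed_versions() should report where the user-level install actually lives, though the exact column name for the path may differ by sparklyr release):

    library(sparklyr)

    # Where spark_install() actually put Spark: one row per installed
    # version, including its install directory (column name assumed "dir").
    installed <- spark_installed_versions()
    print(installed)

    # Point spark_connect() at that directory and name the version explicitly,
    # so sparklyr does not fall back to C:\spark.
    sc <- spark_connect(
      master     = "local",
      spark_home = installed$dir[1],
      version    = "2.1.0"
    )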

Does anyone know a solution?

EDIT:

I have copied the folder to C:/spark and now it works. But I get the following error message:

Created default hadoop bin directory under: C:\spark\tmp\hadoop
Error in start_shell(master = master, spark_home = spark_home, spark_version = version, :
  sparklyr does not currently support Spark version: 2.1.0

But this version is listed under spark_available_versions().

Which version is the newest one I can install?
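One way to check (a sketch; spark_available_versions() lists what spark_install() can download, while the installed sparklyr package determines what spark_connect() actually accepts):

    library(sparklyr)

    # Spark/Hadoop builds that spark_install() can download.
    spark_available_versions()

    # The installed sparklyr release determines which of those spark_connect()
    # supports; check it, and consider updating the package if it is old.
    packageVersion("sparklyr")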


Solution

  • I have installed this version and now everything works fine (a quick sanity check is sketched below):

    spark_install(version = "2.0.0", hadoop_version = "2.6")
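A quick sanity check after connecting (sketch; spark_version(), copy_to() and spark_disconnect() are standard sparklyr calls):

    library(sparklyr)

    sc <- spark_connect(master = "local", version = "2.0.0")

    # Which Spark version is the connection really running?
    spark_version(sc)

    # Small smoke test: copy a data frame into Spark and read it back.
    mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)
    head(mtcars_tbl)

    spark_disconnect(sc)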