I'm very happy with the easy installation of sparklyr.
spark_install(version = "2.1.0", hadoop_version = "2.7")
Installation complete.
But after the installation, when I wanted to connect to Spark, I got the following error message. The folder C:/spark doesn't exist, because RStudio installed the Spark folder under my user directory.
> sc <- spark_connect(master = "local")
Created default hadoop bin directory under: C:\spark\tmp\hadoop
Error in spark_version_from_home(spark_home, default = spark_version) :
  Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION. Try passing the spark version explicitly.
In addition: Warning messages:
1: In dir.create(hivePath, recursive = TRUE) :
  cannot create dir 'C:\spark', reason 'Permission denied'
2: In dir.create(hadoopBinPath, recursive = TRUE) :
  cannot create dir 'C:\spark', reason 'Permission denied'
3: In file.create(to[okay]) :
  cannot create file 'C:\spark\tmp\hadoop\bin\winutils.exe', reason 'No such file or directory'
4: running command '"C:\spark\tmp\hadoop\bin\winutils.exe" chmod 777 "C:\spark\tmp\hive"' had status 127
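The error itself says to try passing the Spark version explicitly. I assume that would look roughly like the snippet below, with spark_home pointing at wherever spark_install() actually put Spark under my user directory (I'm guessing spark_install_find() can report that location; the field name is an assumption on my side):

library(sparklyr)

# Guess: ask sparklyr where the 2.1.0 download landed (field name assumed)
install_info <- spark_install_find(version = "2.1.0", hadoop_version = "2.7")
spark_home <- install_info$sparkVersionDir

# Pass both the home directory and the version explicitly
sc <- spark_connect(master = "local",
                    spark_home = spark_home,
                    version = "2.1.0")

I'm not sure this avoids the 'Permission denied' warnings for C:\spark, though.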
Does someone know a solution?
EDIT:
I have copied the folder to C:/spark and now it works. But I get the following error message:
Created default hadoop bin directory under: C:\spark\tmp\hadoop
Error in start_shell(master = master, spark_home = spark_home, spark_version = version, :
  sparklyr does not currently support Spark version: 2.1.0
But this version is listed under spark_available_versions().
Which is the newest version I can install?
I have installed this version and everything works fine:
spark_install(version = "2.0.0", hadoop_version = "2.6")
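For completeness, the full sequence that now works on my machine (the spark_disconnect() call at the end is just the usual cleanup):

library(sparklyr)

# Install and connect with a version sparklyr currently supports
spark_install(version = "2.0.0", hadoop_version = "2.6")
sc <- spark_connect(master = "local", version = "2.0.0")

# ...work with the connection...

spark_disconnect(sc)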