Error while installing SparkR package using install_github


I am trying to use the SparkR package in R. I have installed all the dependencies, such as devtools and Rtools.

When I try the following command:

install_github("amplab-extras/SparkR-pkg",subdir="pkg")

I get the following error:

Downloading github repo amplab-extras/SparkR-pkg@master
Error in function (type, msg, asError = TRUE ) :
  Received HTTP code 403 from proxy after CONNECT

To solve this, I set working http_proxy and https_proxy values, but the install still fails with the same error. I am new to R/RStudio.


Solution

  • I installed SparkR on 64-bit Windows 7 with R 3.2.x, on a machine that already had Spark 1.4 installed.

    ** If you need to know how to install Spark on Windows, please check the official Spark documentation or the step-by-step process listed here.

    • Go to the bin folder of your Maven installation

      C:\Program Files\apache-maven-3.3.3\bin

    • Open Notepad and paste the following text

      "%~dp0\mvn.cmd" %*

    • Save the file in the bin folder as mvn.bat, so that its full path is

      C:\Program Files\apache-maven-3.3.3\bin\mvn.bat

    • Restart RStudio and execute

      library(devtools)
      install_github("repo/SparkR-pkg", ref="branchname", subdir="pkg")
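
If you would rather not create the file by hand in Notepad, the mvn.bat step above can also be done from R. This is only a minimal sketch: `write_mvn_wrapper` is a hypothetical helper name, not part of devtools or SparkR, and the Maven path in the comment is the one from my machine, so adjust it to yours.

```r
# Hypothetical helper (not part of any package): writes the one-line
# mvn.bat wrapper described in the steps above into Maven's bin folder.
write_mvn_wrapper <- function(maven_bin) {
  # Fail early if the folder does not exist.
  stopifnot(dir.exists(maven_bin))
  wrapper <- file.path(maven_bin, "mvn.bat")
  # In a batch file, %~dp0 expands to the folder containing the script
  # and %* forwards all command-line arguments to mvn.cmd.
  writeLines('"%~dp0\\mvn.cmd" %*', wrapper)
  invisible(wrapper)
}

# Example call, assuming the install path used in the steps above:
# write_mvn_wrapper("C:/Program Files/apache-maven-3.3.3/bin")
```

Writing under C:\Program Files normally needs an R session started with administrator rights; otherwise the call will fail with a permission error and you will have to save the file manually as described above.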