I've installed Spark in a folder inside my home directory and added it to my .bash_profile. From the terminal, I can run `pyspark` or `spark-shell` after running `source ~/.bash_profile`. But for sparklyr, the default Spark location is inside the user folder. Is there a way to permanently change the default location or set up a path variable, so I don't have to configure it every time I start a new R session?
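For context, this is roughly what I end up repeating at the start of each session (a minimal sketch; the path is just where I happened to unpack Spark):

```r
# What I currently repeat in every new R session: point sparklyr at my
# Spark install before connecting. The path below is just my install location.
Sys.setenv(SPARK_HOME = "~/server/spark")

library(sparklyr)
sc <- spark_connect(master = "local")
```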
When I try to connect to Spark while declaring the location where Spark is installed, I get the following error message:

```r
sc <- spark_connect(master = "local", spark_home = "~/server/spark/")
```

```
Error: Java 11 is only supported for Spark 3.0.0+
```
Is there a way to permanently configure JAVA_HOME for sparklyr as well? I haven't found anything about this in the documentation.
Thanks!
I'm using macOS Catalina 10.15.4, RStudio 1.2.5033, and Spark 2.4.5.
I did this in two steps:

1. I got the appropriate Java home by running `/usr/libexec/java_home -v 1.8` in the terminal (this should already be set in the bash profile as well; more details here).
2. I added a JAVA_HOME (and SPARK_HOME) variable to my .Renviron file so that I wouldn't have to set them in each session. I used `usethis::edit_r_environ()` to open the file and restarted my R session for the change to take effect (more details on .Renviron generally here). A sketch of what the entries look like is below.
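For example, the relevant lines in .Renviron might look like this (a sketch only; substitute the path printed by `/usr/libexec/java_home -v 1.8` and your actual Spark directory):

```
# ~/.Renviron -- example values, not the real paths on your machine
JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
SPARK_HOME=/Users/me/server/spark
```

After restarting R, you can check that the values are picked up and connect without passing `spark_home` explicitly, since `spark_connect()` falls back to the SPARK_HOME environment variable:

```r
# Confirm the environment variables R sees, then connect
Sys.getenv(c("JAVA_HOME", "SPARK_HOME"))

library(sparklyr)
sc <- spark_connect(master = "local")
```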