Tags: r, sparklyr

Error when connecting sparklyr to a local Spark instance


I'm trying to run sparklyr in my local environment to replicate a production setup, but I can't even get started. I successfully installed the latest version of Spark using spark_install(), yet when I try to run spark_connect() I get this vague and unhelpful error.

> library(sparklyr)

> spark_installed_versions()
  spark hadoop                                                                 dir
1 2.3.1    2.7 C:\\Users\\...\\AppData\\Local/spark/spark-2.3.1-bin-hadoop2.7

> spark_connect(master = "local")
Error in if (is.na(a)) return(-1L) : argument is of length zero
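
In case it helps with diagnosis: since spark_connect() apparently fails before doing any real work, one thing I can check is whether the SPARK_HOME environment variable is set at all. A minimal check using base R's Sys.getenv() (an empty string means the variable is unset):

    # Show the current value of SPARK_HOME; "" means it is not set,
    # which would leave spark_connect() with no Spark home to use.
    Sys.getenv("SPARK_HOME")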

Here is what my session info looks like.

> sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sparklyr_0.8.4.9003

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17     dbplyr_1.2.1     compiler_3.5.0   pillar_1.2.3     later_0.7.3     
 [6] plyr_1.8.4       bindr_0.1.1      base64enc_0.1-3  tools_3.5.0      digest_0.6.15   
[11] jsonlite_1.5     tibble_1.4.2     nlme_3.1-137     lattice_0.20-35  pkgconfig_2.0.1 
[16] rlang_0.2.1      psych_1.8.4      shiny_1.1.0      DBI_1.0.0        rstudioapi_0.7  
[21] yaml_2.1.19      parallel_3.5.0   bindrcpp_0.2.2   stringr_1.3.1    dplyr_0.7.5     
[26] httr_1.3.1       rappdirs_0.3.1   rprojroot_1.3-2  grid_3.5.0       tidyselect_0.2.4
[31] glue_1.2.0       R6_2.2.2         foreign_0.8-70   reshape2_1.4.3   purrr_0.2.5     
[36] tidyr_0.8.1      magrittr_1.5     backports_1.1.2  promises_1.0.1   htmltools_0.3.6 
[41] assertthat_0.2.0 mnormt_1.5-5     mime_0.5         xtable_1.8-2     httpuv_1.4.3    
[46] config_0.3       stringi_1.1.7    lazyeval_0.2.1   broom_0.4.4   

Solution

  • Well, with a bit of guessing I was able to solve my problem: I had to set the SPARK_HOME environment variable manually, pointing it at the installation directory reported by spark_installed_versions().

    # dir is the third column of spark_installed_versions();
    # piping it into spark_home_set() points SPARK_HOME at that installation
    spark_installed_versions()[1, 3] %>% spark_home_set()
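
    An equivalent approach (a sketch based on the same installation; spark_connect() also accepts the Spark home directly via its spark_home argument) is to pass the path when connecting:

    library(sparklyr)

    # Path of the first installed Spark version, as listed by
    # spark_installed_versions() above
    home <- spark_installed_versions()$dir[1]

    # Pass the home directory explicitly instead of relying on SPARK_HOME
    sc <- spark_connect(master = "local", spark_home = home)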