
RStudio Connection to Spark on IBM Watson Studio


I'm trying to connect to Spark from an RStudio instance on IBM Watson Studio but I'm getting the following error.

    No encoding supplied: defaulting to UTF-8. Error in force(code) : 
    Failed during initialize_connection: attempt to use zero-length 
    variable name
    Log: /tmp/Rtmpdee7QC/file1b33141066_spark.log


    ---- Output Log ----
    hummingbird kernel
    http://localhost:8081/apsrstudio/agent/v1/kernel/hb-connect ; Time 
    Diff :1.31352798938751
    {"code": "import sparklyr._"} ; Time Diff :0.00552034378051758

Here's the code I'm using to create the connection:

    kernels <- load_spark_kernels()
    sc <- spark_connect(config = kernels[1])

Any help would be highly appreciated!


Solution

  • I was able to fix this issue! It turned out I was missing a project access token. Project access tokens can be created manually on the Settings page of your project, as described here. From that page:

    Create an access token on the Settings page of your project. Only project admins can create access tokens. The access token can have viewer or editor access permissions. Only editors can inject the token into a notebook.

    After adding a project access token, I could connect to Spark using the code provided in the question with no problems.

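    library(sparklyr)  # provides spark_connect()
    # load_spark_kernels() is supplied by the Watson Studio RStudio environment
    kernels <- load_spark_kernels()
    sc <- spark_connect(config = kernels[1])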
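
If the connection still fails after adding the token, it may help to make the failure easier to read. The following is only a minimal sketch: `connect_first_kernel` is a hypothetical helper of my own (not part of sparklyr or Watson Studio), and it assumes `load_spark_kernels()` returns something that can be indexed with `[1]`, as in the question.

```r
# Hypothetical helper, for illustration only.
# `kernels` is whatever load_spark_kernels() returned;
# `connect` defaults to sparklyr's spark_connect() and can be swapped out.
connect_first_kernel <- function(kernels, connect = sparklyr::spark_connect) {
  # Stop early with a readable message if there is nothing to connect to.
  if (length(kernels) == 0) {
    stop("No Spark kernels were returned; check the project settings.")
  }
  # Try the first kernel; on failure, report the error and return NULL.
  tryCatch(
    connect(config = kernels[1]),
    error = function(e) {
      message("Spark connection failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage inside Watson Studio RStudio (assumes load_spark_kernels() is available):
# sc <- connect_first_kernel(load_spark_kernels())
```

A `NULL` result means the connection attempt raised an error, and the message printed alongside it carries the original error text. This does not replace the token fix above; it only makes the next failure, if any, easier to diagnose.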