Search code examples
rprestotrinorstudio-serverstarburst

How to execute SQL on R workbench via a starburst connector?


What: Need to connect fetch data from a db using RStudio workbench and Trino.

Issue: Always errors out. Connection to presto host using the same method works fine. Trino driver version is 443 (which is latest at the time of writing)

Constraints: Has to be a JDBC connection. This is an enterprise environment so have limited ability to test or influence changes to RStudio, however some package install etc can be influenced.

What do I need: Guidance on what is wrong with my code? If the issue is not my code, then what needs to change in RStudio workbench to make it work.

Code:

dyn.load(
  '/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.402.b06-1.amzn2.0.1.x86_64/jre/lib/amd64/server/libjvm.so'
)

library(DBI)
library(RJDBC)

user <<- Sys.info()[['user']]
pwd <- rstudioapi::askForPassword(prompt = "Enter SSO Password")

home <- paste0("/mnt/efs/users/",user)
cert_path <- "/connections/starburst/"
certificate <- "SHA2-CA-Bundle.cer"
certificate2 <- "sep.jks"
jar_files = c(list.files(paste0(home,cert_path),
                         pattern="jar$",
                         full.names=T)
)

# driver_class <- "io.trino.jdbc.NonRegisteringTrinoDriver"
driver_class <- "io.trino.jdbc.TrinoDriver"

drv <- RJDBC::JDBC(driverClass = driver_class,
                   classPath = jar_files)
host <- 
  "jdbc:trino://myhost.com:111/hive"

server <-
  paste0(
    host,
    "?user=", user,
    "&password=", pwd,
    # "&SSL=true",
    "&SSLVerification=CA",
    # "&SSLTrustStorePath=", home, cert_path, certificate,
    "&SSLKeyStorePath=",home, cert_path, certificate2
  )

server

conn <<- DBI::dbConnect(drv, server)

query_input <- "SELECT count(*) FROM some.table"

output <- dbGetQuery(conn, query_input)

View(output)

dbDisconnect(conn)

Error:

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set",  : 
  Unable to retrieve JDBC result set
  JDBC ERROR: Error executing query: Error fetching next (attempts: 4, duration: 2.01m)
  Statement: SELECT count(*) FROM some.table

Solution

  • After a lot of hit and trial I was able to solve this on my own. Posting the answer here for reference.

    library(DBI)
    library(RJDBC)
    
    user <<- Sys.info()[['user']]
    pwd <- rstudioapi::askForPassword(prompt = "Enter SSO Password")
    
    home <- paste0("/mnt/efs/users/",user)
    cert_path <- "/connections/starburst/"
    certificate <- "SHA2-CA-Bundle.cer"
    certificate2 <- "sep.jks"
    jar_files = c(list.files(paste0(home,cert_path),
                             pattern="jar$",
                             full.names=T)
    )
    
    # driver_class <- "io.trino.jdbc.NonRegisteringTrinoDriver"
    driver_class <- "io.trino.jdbc.TrinoDriver"
    
    drv <- RJDBC::JDBC(driverClass = driver_class,
                       classPath = jar_files)
    host <- 
      "jdbc:trino://myhost.com:111/hive"
    
    - server <-
    -  paste0(
    -    host,
    -    "?user=", user,
    -    "&password=", pwd,
    -    # "&SSL=true",
    -    "&SSLVerification=CA",
    -    # "&SSLTrustStorePath=", home, cert_path, certificate,
    -    "&SSLKeyStorePath=",home, cert_path, certificate2
    -  )
    
    - server
    
    + conn <<- DBI::dbConnect(drv, host, user, pwd)
    
    query_input <- "SELECT count(*) FROM some.table"
    
    output <- dbGetQuery(conn, query_input)
    
    View(output)
    
    dbDisconnect(conn)