Search code examples
rodbcdbirodbc

ODBC Library in R - getting "Data source name not found and no default driver specified" Error


I have the driver working everything because i am using Power BI perfectly. I am switching to R for some of the calculations and now this error is creeping up

install.packages("odbc")

    library(odbc)

    con<- dbConnect(odbc::odbc(),
                     driver = "[Cloudera ODBC Driver for Apache Hive]",
                     host = "[Confidential]",
                     Schema = "[Confidential]",
                     user = ("Confidential"),
                     password = ("Confidential"),
                     port = 8443)

My ODBC connection is 64 bit

Is there a specific connection string that i need to use here?


Solution

  • There are four issues to consider here:

    1. Originally, the driver had brackets around it, those needed to be removed.

      # original
      con<- dbConnect(odbc::odbc(),
                      driver = "[Cloudera ODBC Driver for Apache Hive]", ...)
      # fixed
      con<- dbConnect(odbc::odbc(),
                      driver = "Cloudera ODBC Driver for Apache Hive", ...)
      
    2. I suspect that the wrong labels user= and password= need to be changed. Per https://db.rstudio.com/databases/hive/, I think they should have been UID= and PWD=, respectively.

    3. Since the connection error suggested connection-encryption (SSL) problems, then further arguments should be provided to DBI::dbConnect to set the correct options. I don't know what they are, to be honest, and I could not find documentation on what they are, so I ignored this problem. On to issue 4 ...

    4. You said that you configured this within your "ODBC Data Source Administrator (64 Bit)", which is a system-wide configuration for named connections, and that you use this configuration for Power BI (meaning that the connection there is good). In your code above, you are not using that system configuration, you are defining it from-scratch. Since the system configuration is known to work, you can use that instead of re-defining, as in

      con <- DBI::dbConnect(odbc::odbc(), dsn = "myhive", UID = "myuser", PWD = "mypass")
      

      where myhive is the name you assigned the configuration within your ODBC data source administration.